A Data-Driven Exploration of GuitarsUnited.com

Our Twitter profile was recently followed by an account belonging to GuitarsUnited. As we are always looking for new opportunities to analyze data and produce visualizations (and we happen to love musical instruments), we saw this as a chance to put some of our R, web scraping, and data visualization skills to the test.

Though a bit limited in terms of data, the website does provide a nice scrapable grid structure of both categories and products. The main page displays the categories like so:

Clicking on a category yields a very similar (and equally scrapable) structure for the products within a category:

Lastly, we must scrape the data from the Repairs Blog, which is in a structured format like so:

After scraping the data on GuitarsUnited products and repairs, we can read them in and perform some initial data processing to define outlier products (based on price) and to remove Gift Certificates from the product categories under consideration:
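The post does not say how the outliers were defined, so the following is only a minimal sketch, assuming an interquartile-range rule. It is written in Python rather than the R used for the original analysis, and the sample rows, the `preprocess` helper, and the `k` multiplier are all hypothetical.

```python
from statistics import quantiles

# Hypothetical rows standing in for the scraped product data.
products = [
    {"title": "Instrument A", "category": "Acoustic Guitars", "price": 349.99},
    {"title": "Instrument B", "category": "Acoustic Guitars", "price": 399.99},
    {"title": "Instrument C", "category": "Electric Guitars", "price": 499.99},
    {"title": "Instrument D", "category": "Electric Guitars", "price": 649.99},
    {"title": "Instrument E", "category": "Bass Guitars", "price": 799.99},
    {"title": "Instrument F", "category": "Acoustic Guitars", "price": 5999.00},
    {"title": "Gift Card", "category": "Gift Certificates", "price": 50.00},
]


def preprocess(rows, excluded_category="Gift Certificates", k=1.5):
    """Drop one category, then flag prices above Q3 + k * IQR as outliers.

    The IQR rule is an assumption; the original analysis may have used
    a different definition of "outlier".
    """
    # Copy each row so the input list is left unchanged.
    kept = [dict(row) for row in rows if row["category"] != excluded_category]
    q1, _, q3 = quantiles([row["price"] for row in kept], n=4)
    upper_fence = q3 + k * (q3 - q1)
    for row in kept:
        row["outlier"] = row["price"] > upper_fence
    return kept


cleaned = preprocess(products)
for row in cleaned:
    print(row["title"], row["price"], "outlier" if row["outlier"] else "")
```

With this rule, only prices far above the bulk of the catalogue are flagged, which matches the spirit of highlighting the most valuable instruments.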

The two data sets we scraped are at the product (or instrument) level and at the repair level (as documented by the GuitarsUnited.com blog). Let’s take a peek at a subset of each:

| Title | Product Link | Category | Category Link | Price |
|---|---|---|---|---|
| Epiphone John Lennon Vintage Sunburst EJ-160E | https://www.guitarsunited.com/shop/acoustic-electric-guitars/epiphone-john-lennon-vintage-sunburst-ej-160e/ | Acoustic Electric Guitars | https://www.guitarsunited.com/product-category/acoustic-electric-guitars/ | 349.99 |
| Martin Custom Road Series Acoustic Electric Dreadnought | https://www.guitarsunited.com/shop/acoustic-electric-guitars/martin-custom-road-series-acoustic-electric-dreadnought/ | Acoustic Electric Guitars | https://www.guitarsunited.com/product-category/acoustic-electric-guitars/ | 649.99 |
| Ovation Acoustic Electric CC-54i | https://www.guitarsunited.com/shop/acoustic-electric-guitars/ovation-acoustic-electric-cc-54i/ | Acoustic Electric Guitars | https://www.guitarsunited.com/product-category/acoustic-electric-guitars/ | 399.99 |

The products data includes the product category, the price, and a text-based description of the product (which we’ve suppressed for readability).

| Repair Title | Repair Text | Published | Repair Tags |
|---|---|---|---|
| 1981 Ibanez Blazer Custom Bass | In for a setup… #Japan #luthier 30.1379607-81.6325861 | June 9, 2018 | 80’s, Amazing, ash, BA, bass, Classic, cool, custom, ibanez, Japan, maple, music, musician, pimp, quality, rare, rock, setup, strings, swank, Sweet, thumper, Tone, vintage, wood |
| Taylor 614CE Complete Expression pickup system replacement | Do you have a old Taylor that needs a new pickup? We can help! 30.1378374-81.6329702 | April 6, 2018 | Acoustic, active, ebony, luthier, maple, setup, Solid, spruce, taylor guitars, Tone, USA, wood |

The repairs data includes the title, further text if available, the date of publication, and a set of tags associated with each repair.

Instruments

With the data ready, let’s begin our exploration by simply looking at the distribution of products by category. We can easily do this by performing some dplyr routines to arrange the products in descending order of the product count per category and then using the transformed data to produce a bar chart of product distribution.
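The original analysis used dplyr in R for this step. As a rough illustration of the same count-and-sort idea, here is a minimal Python sketch using only the standard library; the category list and the `count_by_category` helper are hypothetical and do not reflect the real product counts.

```python
from collections import Counter

# Hypothetical category labels, one per scraped product.
categories = [
    "Electric Guitars", "Acoustic Guitars", "Electric Guitars",
    "Ukuleles", "Banjos", "Electric Guitars", "Acoustic Guitars",
]


def count_by_category(labels):
    """Return (category, count) pairs sorted by descending count."""
    return Counter(labels).most_common()


counts = count_by_category(categories)

# Print a quick text bar chart, largest category first.
for category, n in counts:
    print(f"{category:<18} {'#' * n} ({n})")
```

Sorting before plotting is what makes the resulting bar chart easy to read from the largest category to the smallest.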

Given the name, it is not too surprising that GuitarsUnited primarily offers guitars for purchase. Interestingly, they also offer ukuleles, amps, and even a single banjo.

Next we can analyze the overall price distribution across products:

The majority of products are priced below $1000, but a few outliers exist: products in the range of $3000-$4000, and even one at $5999! Let’s look at the distributions by category to get a feel for which category tends to be the most expensive. We will also highlight the outliers to see which are the most valuable instruments within these categories.

We see that there is a vintage amp priced at $4000, along with a few electric guitars above $3000. Prices for Electric and Acoustic guitars are much more variable than those for Ukuleles and Bass Guitars. The “Gibson J-45 Dreadnought” from 1964 was the biggest outlier, with a price of $5999!

Finally, let’s produce a wordcloud of product descriptions for all of these instruments:

The descriptions highlight several words associated with guitars, including body, sound, maple, and mahogany. The word “custom” pops up a few times to let you know that they do handle specialty arrangements.
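A word cloud is essentially a picture of word frequencies, so the counting behind it can be sketched as follows. This is a Python illustration rather than the R code used for the post; the descriptions, the stop-word list, and the `word_frequencies` helper are hypothetical.

```python
import re
from collections import Counter

# A small, hypothetical stop-word list; a real analysis would use a fuller one.
STOP_WORDS = {"a", "an", "and", "the", "with", "of", "in", "for", "is", "this", "to"}

# Hypothetical product descriptions standing in for the scraped text.
descriptions = [
    "Solid maple body with a warm sound",
    "Mahogany body and maple neck",
    "Custom maple top",
]


def word_frequencies(texts, top_n=5):
    """Count lowercase words across all texts, ignoring stop words."""
    words = []
    for text in texts:
        tokens = re.findall(r"[a-z']+", text.lower())
        words.extend(token for token in tokens if token not in STOP_WORDS)
    return Counter(words).most_common(top_n)


top_words = word_frequencies(descriptions, top_n=2)
print(top_words)
```

The most frequent words are the ones a word cloud would draw largest.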

Repairs

Next, let’s turn our attention to the text of the blog posts documenting repairs done on various guitars and other instruments! First, let’s look at the number of repairs over time by month:
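The monthly counts behind such a chart could be computed along these lines. This is a minimal Python sketch, not the original R code; it assumes the publication dates are formatted like the "Published" values in the sample above and that month names parse under an English locale. The third date and the `repairs_per_month` helper are hypothetical.

```python
from collections import Counter
from datetime import datetime

# The first two dates come from the sample rows; the third is made up.
published = ["June 9, 2018", "April 6, 2018", "April 20, 2018"]


def repairs_per_month(dates):
    """Return sorted (YYYY-MM, count) pairs for dates like 'June 9, 2018'."""
    months = (
        datetime.strptime(date, "%B %d, %Y").strftime("%Y-%m")
        for date in dates
    )
    return sorted(Counter(months).items())


monthly = repairs_per_month(published)
print(monthly)
```

Using a sortable `YYYY-MM` key keeps the months in chronological order without any extra date handling.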

We can see that repairs really spiked in Spring 2017, but have declined as of late. It would be more interesting to see how the type of repair has changed over time. To do that, we perform some word stemming on the titles of the blog posts:
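The post does not name the stemmer or the repair categories that were used, so the sketch below is only an illustration of the idea in Python. The `crude_stem` function is a simple suffix stripper rather than a real stemming algorithm such as Porter's, and the `REPAIR_TYPES` mapping and `repair_type` helper are hypothetical.

```python
import re

# Suffixes removed by the crude stemmer, checked in this order.
SUFFIXES = ("ing", "ment", "ed", "es", "s")

# Hypothetical mapping from a stem to a repair type label.
REPAIR_TYPES = {
    "restring": "restringing",
    "setup": "setup",
    "replace": "replacement",
}


def crude_stem(word):
    """Lowercase a word and strip one common suffix if enough letters remain."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 4:
            return word[: -len(suffix)]
    return word


def repair_type(title):
    """Label a title with the first known repair type found, else 'other'."""
    for token in re.findall(r"[a-z]+", title.lower()):
        stem = crude_stem(token)
        if stem in REPAIR_TYPES:
            return REPAIR_TYPES[stem]
    return "other"


print(repair_type("Taylor 614CE Complete Expression pickup system replacement"))
print(repair_type("1981 Ibanez Blazer Custom Bass"))
```

Stemming lets variants such as "restringing" and "restrings" fall into the same bucket, which is what makes counting repairs by type possible.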

Next, we plot the number of repairs over time by type:

This plot is a little bit unwieldy, but we can immediately see that the repairs spiked in 2017, and restringing is the most common type of repair. When we aggregate the data at a yearly level this pattern is visible even more clearly:

All repair types increased in 2017, but the increase for restringing was the most dramatic. In fact, thus far in 2018 it is the only repair type that has been documented on the site. Let’s go back to monthly aggregation and visualize the data since January 2017 in a line chart:

This really highlights the dramatic increase in Spring 2017, followed by the relative decline in all types of repairs late in 2017.

Data is everywhere, and sometimes a data analysis can be inspired by the smallest of actions – in this case, being followed by a Twitter account focused on a subject that we’re interested in! We hope you enjoyed this brief analysis, and if you want to follow the code and/or recreate this analysis, contact us at services@omnianalytics.io.
You can also check out our R programming course, where we go through dplyr and data manipulation and visualization steps, here.

Posted on September 20, 2018 in Case Studies

Lawrence Mosley

About the Author
