HTML and CSS are the basics of building a website. While HTML defines the structure of the page, a CSS stylesheet lets you set the visual properties of the HTML elements. A blue background, a bold font or even the spacing between two paragraphs are all defined by CSS.

To style an element, or a series of elements, CSS uses selectors, which describe the elements we want to apply properties to.

Let’s start with a simple piece of HTML:

<body>
   <h1>Hello all</h1>
</body>

In our stylesheet, we can apply some style to the title by writing something like:

h1 { color: blue; }

Here, h1 is the selector: it indicates the path to the element. What’s inside the {} are the properties to apply.

With ImportFromWeb, instead of styling a page, we want to extract data from it. Hence, we only need to define the right paths to the elements we want to extract.
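To make the idea concrete, here is a minimal Python sketch of what "extracting with a path" means. It is not how ImportFromWeb works internally. It only runs Python's standard xml.etree.ElementTree parser on the snippet above, and the extract helper is a hypothetical name introduced for illustration.

```python
import xml.etree.ElementTree as ET

# The sample page from above, parsed into a tree of elements.
page = ET.fromstring("<body><h1>Hello all</h1></body>")


def extract(root, tag):
    """Return the text of every element whose tag name matches the selector."""
    return [element.text for element in root.iter(tag)]


print(extract(page, "h1"))  # ['Hello all']
```

The same h1 that a stylesheet would use to colour the title is used here to read its content.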

Study the HTML code of any page

Your first reflex should be to open the developer tools in the browser you use. Chrome, Safari and Firefox, at least, all have them. They usually open with F12.

You’ll notice a button in the top left corner of the developer tools. Click on it, and move the mouse over the content you want to extract.
It will highlight the related element in the source code. Pretty handy, isn’t it?

Let the browser decide the path for you

Now right-click on the highlighted element and choose Copy > selector. The selector is now in your clipboard, and you can paste it into your Google Sheets IMPORTFROMWEB function.

Example:

=IMPORTFROMWEB("https://www.imdb.com/chart/moviemeter", "main > div > span > div > div > div.lister > table > tbody > tr:nth-child(7) > td.titleColumn")

The downside is that the copied path is usually complex and depends on many parent elements. Web pages tend to change often, so relying on a long chain of elements is risky: if any one of them is moved or removed in the future, the selector stops matching.
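The fragility is easy to reproduce outside the browser. The Python sketch below is only an illustration: it implements a tiny, hypothetical subset of CSS (tag, #id, .class, the descendant space and the > child combinator) on top of the standard library, and both HTML snippets are made-up stand-ins for a real page before and after a redesign.

```python
import re
import xml.etree.ElementTree as ET

# Simple selector: optional tag, optional #id, any number of .class parts.
SIMPLE = re.compile(r"^(?P<tag>[\w-]*)(?:#(?P<id>[\w-]+))?(?P<classes>(?:\.[\w-]+)*)$")


def matches(element, simple):
    """Check one element against a simple selector such as 'td.titleColumn'."""
    parts = SIMPLE.match(simple)
    if parts is None:
        raise ValueError(f"unsupported selector part: {simple!r}")
    if parts["tag"] and element.tag != parts["tag"]:
        return False
    if parts["id"] and element.get("id") != parts["id"]:
        return False
    wanted = [name for name in parts["classes"].split(".") if name]
    present = element.get("class", "").split()
    return all(name in present for name in wanted)


def select(root, selector):
    """Return elements matching a selector built with ' ' and '>' combinators."""
    steps, combinator = [], " "
    for token in selector.replace(">", " > ").split():
        if token == ">":
            combinator = ">"
        else:
            steps.append((combinator, token))
            combinator = " "
    parent_of = {child: parent for parent in root.iter() for child in parent}

    def fits(element, index):
        combinator, simple = steps[index]
        if not matches(element, simple):
            return False
        if index == 0:
            return True
        ancestor = parent_of.get(element)
        if combinator == ">":  # direct parent must match the previous step
            return ancestor is not None and fits(ancestor, index - 1)
        while ancestor is not None:  # any ancestor may match the previous step
            if fits(ancestor, index - 1):
                return True
            ancestor = parent_of.get(ancestor)
        return False

    return [element for element in root.iter() if fits(element, len(steps) - 1)]


today = ET.fromstring(
    "<main><div><table><tbody><tr>"
    "<td class='titleColumn'>Movie A</td>"
    "</tr></tbody></table></div></main>"
)
# Hypothetical redesign: the same table, now wrapped in one extra <section>.
redesigned = ET.fromstring(
    "<main><section><div><table><tbody><tr>"
    "<td class='titleColumn'>Movie A</td>"
    "</tr></tbody></table></div></section></main>"
)

long_path = "main > div > table > tbody > tr > td.titleColumn"
short_path = "td.titleColumn"

for name, page in (("today", today), ("redesigned", redesigned)):
    print(name, len(select(page, long_path)), len(select(page, short_path)))
# today 1 1
# redesigned 0 1
```

One extra wrapper is enough to break the long path, while the short selector, which depends on a single class, still finds the cell.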

Define your own CSS Path

It’s actually easy to create CSS Selectors if you have a basic knowledge of HTML.

Here is a table that shows the basics and some interesting combinations:

Selector               Element selected

.titleColumn           <div class="titleColumn"> Title </div>

#title1                <div id="title1"> Title </div>

h1                     <h1 class="…"> Title </h1>

div:first-child        <div class="one">Tiger</div>          ← selected
                       <div class="two">Horse</div>
                       <div class="three">Elephant</div>

div                    <div class="one">Table</div>          ← selected
                       <div class="two">Chair</div>          ← selected
                       <p class="three">Stool</p>

div:nth-child(2)       <div class="one">Amazon</div>
                       <div class="two">Google</div>         ← selected
                       <div class="three">Twitter</div>

div:last-child         <div class="one">Amazon</div>
                       <div class="two">Google</div>
                       <div class="three">Twitter</div>      ← selected

div h1                 <div>
                       <p>
                       <h1>The title</h1>                    ← selected
                       </p>
                       </div>

div > p > h1           <div>
                       <p>
                       <h1>The title</h1>                    ← selected
                       </p>
                       </div>

div[role="heading"]    <div role="heading">Hey</div>

a/href                 <a href="https://google.com">

div.highlight          <p class="highlight"> … </p>
                       <div class="highlight">Breaking news</div>   ← selected
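One subtlety in the table above is worth spelling out: div:nth-child(2) means "a div that is the second child of its parent", not "the second div". The Python sketch below emulates only that rule with the standard library. The nth_child helper, the <section> wrapper and the second snippet are assumptions added for illustration.

```python
import xml.etree.ElementTree as ET


def nth_child(root, tag, n):
    """Emulate 'tag:nth-child(n)': elements that are the n-th child of their parent."""
    return [
        child
        for parent in root.iter()
        for position, child in enumerate(parent, start=1)  # positions are 1-based
        if position == n and child.tag == tag
    ]


companies = ET.fromstring(
    "<section>"
    "<div class='one'>Amazon</div>"
    "<div class='two'>Google</div>"
    "<div class='three'>Twitter</div>"
    "</section>"
)
# Hypothetical variant: a <p> now comes first, so every <div> shifts one position.
shifted = ET.fromstring(
    "<section>"
    "<p>Intro</p>"
    "<div class='one'>Amazon</div>"
    "<div class='two'>Google</div>"
    "</section>"
)

print([el.text for el in nth_child(companies, "div", 2)])  # ['Google']
print([el.text for el in nth_child(shifted, "div", 2)])    # ['Amazon']
```

Positional selectors therefore depend on every sibling before the target, which is one more reason to prefer a class or an id when the page offers one.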

Ready to try it out?

Install ImportFromWeb and start extracting web data in Google Sheets