HTML and CSS are the basics of building a website. While HTML defines the structure of the page, a CSS stylesheet sets the visual properties of the HTML elements. A blue background, a bold font, or even the spacing between two paragraphs are all defined by CSS.
With ImportFromWeb, we use CSS for a slightly different goal: telling the machine what data to look for on a web page.
There are other ways to retrieve web data with ImportFromWeb:
- Start with our ready-to-use solutions
- Use prebuilt selectors
- Use XPaths, which offer more possibilities than CSS
To style an element, or a series of elements, CSS uses selectors that describe which elements the properties apply to.
Let’s start with a simple piece of HTML:
<body> <h1>Hello all</h1> </body>
In our stylesheet, we can style the title by writing something like:
h1 { color: blue; }
h1 indicates the path of the element, and what’s inside {} defines the properties.
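The same pattern covers the kinds of styles mentioned in the introduction. Here is a minimal sketch; the 16px value is arbitrary and only serves as an illustration:

```css
/* Every <h1> on the page gets a blue background */
h1 {
  background-color: blue;
}

/* Every <p> gets bold text and some space below it */
p {
  font-weight: bold;
  margin-bottom: 16px;
}
```

In each rule, the part before the braces is the selector, and that is the only part ImportFromWeb needs.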
Using IMPORTFROMWEB:
- In our Google Sheet, let’s paste the HTML code above into cell A1
- In A3, let’s type:
=IMPORTFROMWEB(A1, "h1")
The expected output in A3 is:
Hello all
How to find the CSS path for any web data
Your first reflex should be to open the developer tools in your browser. Chrome, Safari and Firefox all have them, and they usually open with F12.
You’ll notice a button in the top left corner. Click on it, then move the mouse over the content you want to extract.
The related element will be highlighted in the source code. Pretty handy, isn’t it?
Let the browser decide the path for you
Now right-click the highlighted element and choose Copy > selector. The selector is now in your clipboard, ready to use in the IMPORTFROMWEB function in Google Sheets.
Example:
=IMPORTFROMWEB("https://www.imdb.com/chart/moviemeter", "main > div > span > div > div > div.lister > table > tbody > tr:nth-child(7) > td.titleColumn")
The downside is that the generated path is usually complex and depends on the parent elements. Web pages change often, so it’s risky to depend on many other elements: any of them may no longer exist in the future.
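As a rough illustration, here is the generated path from the example next to two shorter alternatives, written as empty CSS rules so that only the selectors matter. The shorter forms assume that the titleColumn class is enough to identify the cells on that page, which is worth checking in the developer tools first.

```css
/* Generated by the browser: stops matching if any ancestor changes */
main > div > span > div > div > div.lister > table > tbody > tr:nth-child(7) > td.titleColumn {}

/* Keeps only the row position and the cell class: still targets a 7th row */
tr:nth-child(7) > td.titleColumn {}

/* Keeps only the cell class: matches that cell in every row */
td.titleColumn {}
```

Note that the last selector matches every title cell rather than a single one, so the choice depends on whether you want one value or the whole column.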
Define your own CSS Path
It’s actually easy to create CSS selectors if you have a basic knowledge of HTML.
Here is a table that shows the basics and some interesting combinations:
Selector | Element selected |
---|---|
.titleColumn | <div class="titleColumn"> Title </div> |
#title1 | <div id="title1"> Title </div> |
h1 | <h1 class="…"> Title </h1> |
div:first-child | <div class="one">Tiger</div> <div class="two">Horse</div> <div class="three">Elephant</div> (selects Tiger) |
div | <div class="one">Table</div> <div class="two">Chair</div> <p class="three">Stool</p> (selects Table and Chair) |
div:nth-child(2) | <div class="one">Amazon</div> <div class="two">Google</div> <div class="three">Twitter</div> (selects Google) |
div:last-child | <div class="one">Amazon</div> <div class="two">Google</div> <div class="three">Twitter</div> (selects Twitter) |
div h1 | <div> <p> <h1>The title</h1> </p> </div> |
div > p > h1 | <div> <p> <h1>The title</h1> </p> </div> |
div[role="heading"] | <div role="heading">Hey</div> |
a/href | <a href="https://google.com"> |
div.highlight | <p class="highlight"> … </p> <div class="highlight">Breaking news</div> (selects Breaking news) |
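The entries in the table can also be combined. The sketch below shows a few combinations as empty CSS rules; the page structure they assume is hypothetical and only reuses names from the table.

```css
/* An <h1> anywhere inside the element with id="title1" */
#title1 h1 {}

/* A <div> with class "highlight" that is also the first child of its parent */
div.highlight:first-child {}

/* A <p> that is a direct child of a <div> with role="heading" */
div[role="heading"] > p {}
```

A short selector built this way tends to survive page changes better than a long path, as long as the class, id or attribute it relies on stays in place.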
Ready to try it out?
Install ImportFromWeb and start extracting web data in Google Sheets.