In your =IMPORTFROMWEB( ) functions, the data selectors describe the content you want to import.
There are 4 types of selectors:
- Generic selectors
- Built-in selectors, i.e. selectors preconfigured for some specific websites
- XPaths
- CSS selectors
Examples:
=IMPORTFROMWEB("https://www.example.com", "//a[@id='artist_link']")
=IMPORTFROMWEB("https://www.example.com", A1)
The =IMPORTFROMWEB( ) functions accept a single selector or a reference to a range of selectors.
If you need several selectors, we recommend putting all of them in the same function rather than creating a new function for each data selector.
=IMPORTFROMWEB("https://www.example.com", A1:K1)
Generic selectors
These selectors are universal and work with most websites. They describe basic content that almost any webpage has.
Selector | Definition |
---|---|
title | the page title |
h1 | the main headings |
h2 | secondary headings |
table | the first table |
table/x | the x-th table of the page (replace x with the number you want) |
normalize-space(//body) | the whole content of the page contained in the <body> |
emails | all the emails contained in the page |
linkUrls | the link urls |
linkAnchors | the link anchors (i.e. the clickable text of the link) |
metaTitle | the title from the <meta> tag |
metaDescription | the descriptions in the <meta> tag |
imageSources | the image sources (URL) |
imageAlternatives | the image descriptions from the image alt attributes |
metaKeywords | the keywords in the <meta> tag |
metaTwitter id | the id of the related Twitter account |
Examples:
=IMPORTFROMWEB("https://website.com", "title") will import the page title of your URL
With generic and built-in selectors, you can combine several selectors by separating them with commas. Make sure to keep the quotes around the whole set:
=IMPORTFROMWEB("https://website.com", "metaTitle,metaDescription") will import the Meta Title and Meta Description of your URL
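To make the generic selectors more concrete, here is a minimal Python sketch, run outside the spreadsheet, of the kind of content that title, linkUrls and linkAnchors refer to. It is only an illustration: the class name and the sample HTML are made up, and it does not describe how ImportFromWeb works internally.

```python
from html.parser import HTMLParser


class GenericSelectorSketch(HTMLParser):
    """Hypothetical helper: collects the page title, link URLs and link anchors."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.link_urls = []      # comparable to what "linkUrls" describes
        self.link_anchors = []   # comparable to what "linkAnchors" describes
        self._in_title = False
        self._in_link = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "a":
            href = dict(attrs).get("href")
            if href is not None:
                self._in_link = True
                self.link_urls.append(href)
                self.link_anchors.append("")

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag == "a":
            self._in_link = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data
        elif self._in_link:
            # Append the clickable text to the most recent link
            self.link_anchors[-1] += data


# Made-up sample page, used only for this illustration
SAMPLE_HTML = """
<html><head><title>Sample page</title></head>
<body>
  <a href="https://www.example.com/a">First link</a>
  <a href="https://www.example.com/b">Second link</a>
</body></html>
"""

parser = GenericSelectorSketch()
parser.feed(SAMPLE_HTML)
print(parser.title)
print(parser.link_urls)
print(parser.link_anchors)
```

In the spreadsheet, the equivalent request would simply list the selectors inside the quotes, separated by commas.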
Built-in selectors
For some mainstream websites, data can be extracted using built-in selectors, i.e. labels we have pre-configured.
For example, you can extract prices from Amazon using the selector “sale_price” or phone numbers from Google Maps using the selector “phone_number”.
Currently available platforms:
- Amazon product pages
- Amazon reviews
- Amazon search pages
- Google Search
- Walmart product pages
- Google Maps places
- YouTube videos
- Yahoo Finance
- and many more to come!
Feel free to reach out to us to order specific selectors for the website you want to scrape data from.
XPaths
XPath (XML Path Language) is a query language for selecting nodes from an XML document. XPaths can be used to describe the location of any element on a webpage.
As XPaths are often long strings of symbols, they can look a bit scary.
However, with the help of your web browser, it can be very easy to find the right XPath for the piece of data you are looking for.
Find out more in this article: Find an xPath with limited HTML knowledge
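As a rough illustration of what an XPath such as //a[@id='artist_link'] selects, here is a small Python sketch using the standard library's ElementTree module. The sample markup is invented for the example, and ElementTree only supports a subset of XPath (it needs a leading "." to search from the root), so treat it as a way to see the idea rather than as what ImportFromWeb runs.

```python
import xml.etree.ElementTree as ET

# Made-up, well-formed sample markup (ElementTree needs valid XML)
SAMPLE = """
<html>
  <body>
    <div class="header">
      <a id="home_link" href="/">Home</a>
    </div>
    <div class="content">
      <a id="artist_link" href="/artists/42">Some Artist</a>
    </div>
  </body>
</html>
""".strip()

root = ET.fromstring(SAMPLE)

# ".//a" looks for <a> elements at any depth below the root;
# "[@id='artist_link']" keeps only the one whose id attribute matches.
match = root.find(".//a[@id='artist_link']")

artist_text = match.text          # the text inside the link
artist_href = match.get("href")   # the value of its href attribute
print(artist_text, artist_href)
```

The path reads left to right: any link, at any depth, whose id equals artist_link. The other link in the sample is ignored because its id differs.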
CSS Selectors
CSS selectors are less powerful than XPaths for finding complex elements in HTML pages, but they usually do the job and are easier to build.
Select values in attributes
Standard CSS selectors cannot return values contained in attributes. With ImportFromWeb, this is possible by adding /attributeName just after the element.
For example, img/src should return the source of the image.
Select an occurrence
The CSS notation :nth-of-type(), which selects a specific occurrence of an element, can be written as /n
Example: table/2
Find out more in this article: Introduction to CSS selectors
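The two shorthand notations in this section can be read as "standard CSS selector plus an optional suffix". The Python sketch below shows one way to picture that split. The function expand_shorthand is hypothetical and written only for this illustration; it assumes a purely numeric suffix means an occurrence and any other suffix means an attribute name, which may not match every case ImportFromWeb handles.

```python
def expand_shorthand(selector):
    """Hypothetical helper: return (standard CSS selector, attribute name or None)."""
    base, sep, suffix = selector.rpartition("/")
    if not sep:
        # No "/" suffix: the selector is already plain CSS
        return selector, None
    if suffix.isdigit():
        # "/n" stands for the n-th occurrence, i.e. :nth-of-type(n)
        return f"{base}:nth-of-type({suffix})", None
    # Otherwise the suffix names the attribute whose value is wanted
    return base, suffix


print(expand_shorthand("table/2"))  # ('table:nth-of-type(2)', None)
print(expand_shorthand("img/src"))  # ('img', 'src')
print(expand_shorthand("h1"))       # ('h1', None)
```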