In your =IMPORTFROMWEB( ) functions, the data selectors describe the content you want to import.
There are 4 types of selectors:
- Generic selectors
- Built-in selectors, i.e. selectors preconfigured for some specific websites
- XPaths
- CSS selectors
The ImportFromWeb functions accept a single selector or a reference to a range of selectors.
To optimize your ImportFromWeb usage, we recommend entering all your data selectors in the same function rather than creating a new function for each data selector.
Generic selectors are universal and work with most websites. They describe basic content that any webpage has.
|title||the page title|
|h1||the main headings|
|table||the first table|
|table/x||the x-th table of the page (replace x with the number you want)|
|emails||all the emails contained in the page|
|linkUrls||the link urls|
|linkAnchors||the link anchors (i.e. the clickable text of the link)|
|metaTitle||the title from the <meta> tag|
|metaDescription||the descriptions in the <meta> tag|
|imageSources||the image sources (URL)|
|imageAlternatives||the image descriptions from the image alt attributes|
|metaKeywords||the keywords in the <meta> tag|
|metaTwitter id||the id of the related Twitter account|
=IMPORTFROMWEB("https://website.com", "title") will import the page title of your URL
With generic and built-in selectors, you can request several items at once by separating the selectors with commas. Make sure to keep the quotes around the whole set:
=IMPORTFROMWEB("https://website.com", "metaTitle,metaDescription") will import the Meta Title and Meta Description of your URL
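To make the idea of comma-separated generic selectors more concrete, here is a minimal Python sketch of what a few of them could map to in a page's HTML. It is only an illustration and not how ImportFromWeb is implemented: the class GenericExtractor, the function import_from_html and the sample page are hypothetical, and only four selectors from the table above are covered.

```python
from html.parser import HTMLParser


class GenericExtractor(HTMLParser):
    """Collects the content matching a few generic selectors (illustrative only)."""

    def __init__(self):
        super().__init__()
        self.results = {"title": [], "h1": [], "linkUrls": [], "imageSources": []}
        self._open_tag = None  # "title" or "h1" while we are inside that element

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("title", "h1"):
            self._open_tag = tag
            self.results[tag].append("")
        elif tag == "a" and attrs.get("href"):
            self.results["linkUrls"].append(attrs["href"])
        elif tag == "img" and attrs.get("src"):
            self.results["imageSources"].append(attrs["src"])

    def handle_endtag(self, tag):
        if tag == self._open_tag:
            self._open_tag = None

    def handle_data(self, data):
        # Text may arrive in several chunks, so append to the current entry.
        if self._open_tag:
            self.results[self._open_tag][-1] += data


def import_from_html(html, selectors):
    """Return {selector: values} for a comma-separated selector string."""
    parser = GenericExtractor()
    parser.feed(html)
    parser.close()
    names = [name.strip() for name in selectors.split(",")]
    return {name: parser.results.get(name, []) for name in names}


# Hypothetical sample page used only for this sketch.
PAGE = """
<html><head><title>Demo shop</title></head>
<body>
  <h1>Blue mug</h1>
  <a href="/cart">Cart</a>
  <img src="/mug.png" alt="A blue mug">
</body></html>
"""

print(import_from_html(PAGE, "title,h1"))
```

As in the spreadsheet function, the whole set of selectors is passed as one quoted string, and each selector produces its own group of values.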
For some mainstream websites, data can be extracted using built-in selectors, i.e. labels we have preconfigured.
For example, you can extract prices from Amazon using the selector “sale_price” or phone numbers from Google Maps using the selector “phone_number”.
Currently available platforms:
- Amazon product pages
- Amazon reviews
- Amazon search pages
- Google Search
- Walmart product pages
- Google Maps places
- Yahoo Finance
- and many more to come!
Feel free to reach out to us to request custom selectors for the website you want to scrape data from.
Work with XPaths
XPath (XML Path Language) is a query language for selecting nodes from an XML document. XPaths can be used to describe the location of any element on a webpage.
As XPaths are usually long strings of symbols, they can look a bit intimidating.
However, with the help of your Web browser, it can be very easy to find the right XPath for the piece of data you are looking for.
Find out more in this article: Find an xPath with limited HTML knowledge
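To show how an XPath points at a location in a page, here is a small Python sketch based on the standard library module xml.etree.ElementTree. It rests on a few assumptions: the sample page is hypothetical and deliberately well-formed, ElementTree only supports a subset of XPath, and its expressions must be relative (starting with a dot), so the XPaths copied from a browser may need adapting before they run here.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed sample page. Real pages are rarely valid XML,
# so scraping tools rely on an HTML-aware parser instead.
PAGE = """
<html>
  <body>
    <div id="product">
      <h1>Blue mug</h1>
      <span class="price">12.90</span>
      <ul>
        <li>Ceramic</li>
        <li>Dishwasher safe</li>
      </ul>
    </div>
  </body>
</html>
""".strip()

root = ET.fromstring(PAGE)


def select_text(xpath):
    """Return the text of every node matched by a relative XPath."""
    return [node.text for node in root.findall(xpath)]


# An element identified by one of its attributes, anywhere in the page.
price = select_text(".//span[@class='price']")

# A step-by-step path ending with a position: the second <li> of the list.
second_feature = select_text(".//div[@id='product']/ul/li[2]")

print(price, second_feature)
```

The two expressions illustrate the most common building blocks: a predicate on an attribute ([@class='price']) and a position in a list of siblings (li[2]).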
Work with CSS Selectors
CSS selectors are less powerful than XPaths at finding complex elements in HTML pages, but they usually do the job and are easier to build.
Select values in attributes
Standard CSS selectors cannot return values contained in attributes. With ImportFromWeb, this is possible by adding /attributeName just after the element.
For example, img/src returns the source of the image.
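The /attributeName convention can be pictured as splitting the selector into an element part and an attribute part. The Python sketch below mimics that idea under strong assumptions: split_selector, AttributeCollector and select_attribute are hypothetical names, the element part is limited to a bare tag name such as img, and nothing here reflects how ImportFromWeb actually evaluates selectors.

```python
from html.parser import HTMLParser


def split_selector(selector):
    """Split 'img/src' into ('img', 'src'); the attribute is None if absent."""
    element, sep, attribute = selector.rpartition("/")
    if not sep:
        return selector, None
    return element, attribute


class AttributeCollector(HTMLParser):
    """Collects one attribute from every element with a given tag name."""

    def __init__(self, tag, attribute):
        super().__init__()
        self.tag = tag
        self.attribute = attribute
        self.values = []

    def handle_starttag(self, tag, attrs):
        if tag == self.tag:
            value = dict(attrs).get(self.attribute)
            if value is not None:
                self.values.append(value)


def select_attribute(html, selector):
    """Return the attribute values described by an 'element/attribute' selector."""
    tag, attribute = split_selector(selector)
    if attribute is None:
        raise ValueError("selector has no /attributeName suffix")
    collector = AttributeCollector(tag, attribute)
    collector.feed(html)
    collector.close()
    return collector.values


# Hypothetical HTML fragment used only for this sketch.
html = '<p><img src="/a.png" alt="A"><img src="/b.png"></p>'
print(select_attribute(html, "img/src"))
```

Elements that lack the requested attribute are simply skipped, which is why asking for img/alt on the fragment above would yield a single value.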
Select an occurrence
The CSS notation :nth-of-type(), which selects a specific occurrence of an element, can be written like
Find out more in this article: Introduction to CSS selectors