
Slack App That Scrapes Websites for Data


In this guide, we will learn how to retrieve scraped data and send it into Slack. We’ll quickly set up a Slack app that scrapes websites for links using a slash command and posts the results inside a Slack channel like this:

Once you deploy this Slack app, you can return and modify the code at any time to add additional logic or scrape different data.

Use Cases

📰 Scrape data from news websites to share, compare, and discuss inside a Slack channel.

💌 Scrape domains for email addresses, apply enrichment with Clearbit APIs, and share the results with your sales team inside Slack.

☎️ Quickly gather phone numbers and assist your SDR team with cold calling campaigns inside Slack.

🔎 Assist your marketing team by pulling data from forums and social media to perform sentiment analysis, all within Slack.

Please remember to respect the policies around web crawlers of any sites you scrape.

Table of Contents:

  • Install from Github
  • Test Your Slack App Website Scraper
  • How It Works
  • Making Changes
  • Support

Install from Github

Head on over to Github to fork my project’s code 👉🏼 https://github.com/JanethL/SlackAppWebscraper/blob/master/README.md

Click the Open in Autocode button. You will be prompted to sign in or create a FREE account. If you have a Standard Library account click Already Registered and sign in using your Standard Library credentials.

Give your project a unique name and select Start API Project from Github.

Autocode automatically sets up a project scaffold and saves your project as an API endpoint, but it hasn’t been deployed yet.

To deploy your API to the cloud, navigate through the functions/events/slack/command/ folders and select the scrape.js file.

Select the red 1 Account Required button, which will prompt you to link a Slack account.

If you’ve built Slack apps with Standard Library, you’ll see existing Slack accounts, or you can select Link New Resource to link a new Slack app.

Select Install Standard Library App.

You should see an OAuth popup that looks something like this:

Select Allow. You’ll have the option to customize your Slack app with a name and image.

Select Finish. The green checkmarks confirm that you’ve linked your accounts correctly. Click Finished Linking.

To deploy your API to the cloud select Deploy API in the bottom-left of the file manager.

🙌 Test Your Slack App Website Scraper

You’re all done. Try it out! Your Slack App is now available for use in the Slack workspace you authorized it for.

Your Slack app should respond to:
/cmd scrape <url> <selector> as I show in the screenshot:

I’ve included an additional command as a cheat sheet that lists a few websites and their selectors for retrieving links.

Just type /cmd list and you should see your app respond with the following message.

Or review the previous tutorial to learn how to scrape using CSS selectors.

How It Works

When you submit /cmd scrape https://techcrunch.com/ a.post-block__title__link (or any URL followed by its respective selector) in Slack’s message box, a webhook will be triggered. The webhook, built and hosted on Standard Library, will first make a request to crawler.api, which will return a JSON payload with results from the query.

Our webhook will then create Slack messages for each event and post those to the channel where the command was invoked.

const lib = require('lib')({token: process.env.STDLIB_SECRET_TOKEN});
/**
* An HTTP endpoint that acts as a webhook for Slack command event
* @param {object} event
* @returns {object} result Your return value
*/
module.exports = async (event) => {
 // Store API Responses
 const result = {slack: {}, crawler: {}};
 
 if ((event.text || '').split(/\s+/).length !== 2) {
   return lib.slack.channels['@0.6.6'].messages.create({
     channel: `#${event.channel_id}`,
     text: `"${event.text}" has the wrong format. Expected: /cmd scrape <url> <selector>`
   });
 }
 
 console.log(`Running [Slack → Retrieve Channel, DM, or Group DM by id]...`);
 result.slack.channel = await lib.slack.conversations['@0.2.5'].info({
     id: `${event.channel_id}`
 });
 console.log(`Running [Slack → Retrieve a User]...`);
 result.slack.user = await lib.slack.users['@0.3.32'].retrieve({
     user: `${event.user_id}`
 });
 
 console.log(`Running [Crawler → Query (scrape) a provided URL based on CSS selectors]...`);
 result.crawler.pageData = await lib.crawler.query['@0.0.1'].selectors({
     url: event.text.split(/\s+/)[0],
     userAgent: `stdlib/crawler/query`,
     includeMetadata: false,
     selectorQueries: [
         {
             'selector': event.text.split(/\s+/)[1],
             'resolver': `attr`,
             'attr': 'href'
         }
     ]
 });
 let text = `Here are the links that we found for ${event.text.split(/\s+/)[0]}\n \n ${result.crawler.pageData.queryResults[0].map((r) => {
   if (r.attr.startsWith('http://') || r.attr.startsWith('https://') || r.attr.startsWith('//')) {
       return r.attr;
   } else {
       return result.crawler.pageData.url + r.attr;
   }
 }).join(' \n ')}`;
 console.log(`Running [Slack → Send a Message from your Bot to a Channel]...`);
 result.slack.response = await lib.slack.channels['@0.6.6'].messages.create({
   channel: `#${event.channel_id}`,
   text: text
 });
 return result;
};

The first line of code imports an NPM package called lib, which allows us to communicate with other APIs on top of Standard Library:

const lib = require('lib')({token: process.env.STDLIB_SECRET_TOKEN});

Lines 2–6 are a comment that serves as documentation and allows Standard Library to type-check calls to our function. If a call does not supply a parameter of the correct (or expected) type, it will return an error.

Line 7 is a function (module.exports) that exports the entire code found in lines 8–54. Once we deploy our code, this function is wrapped into an HTTP (API) endpoint and automatically registered with Slack, so that every time a Slack command event happens, Slack sends the event payload to our API endpoint to consume.

Lines 11–16 are an if statement that handles improper input and posts a message to Slack using lib.slack.channels['@0.6.6'].messages.create.
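As a standalone sketch, that validation step boils down to splitting the command text on whitespace and requiring exactly two tokens: the URL and the selector. Here's a minimal illustration (the isValidScrapeInput helper name is mine, not part of the project):

```javascript
// Returns true only when the command text contains exactly two
// whitespace-separated tokens (a URL and a CSS selector).
const isValidScrapeInput = (text) => {
  return (text || '').split(/\s+/).length === 2;
};

// A well-formed command passes...
console.log(isValidScrapeInput('https://techcrunch.com/ a.post-block__title__link')); // true
// ...while missing arguments (or an empty/undefined text field) fail.
console.log(isValidScrapeInput('https://techcrunch.com/')); // false
console.log(isValidScrapeInput('')); // false
```

The `|| ''` fallback matters because Slack can deliver a command with no text at all, in which case event.text is undefined.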

Lines 18–21 make an HTTP GET request to the lib.slack.conversations['@0.2.5'] API, using the info method to retrieve the channel object, which contains information about the channel (its name, topic, purpose, etc.), and store it in result.slack.channel.

Lines 22–25 likewise make an HTTP GET request to lib.slack.users['@0.3.32'], using the retrieve method to get the user object, which contains information about the user, and store it in result.slack.user.

Lines 27–39 make an HTTP GET request to lib.crawler.query['@0.0.1'], passing in the inputs from the invoked Slack command event. For url we pass in the first input from our Slack event: event.text.split(/\s+/)[0].

userAgent is set to the default, stdlib/crawler/query.

includeMetadata is false (if true, the response will include additional metadata in a meta field).

selectorQueries is an array with one object: {selector: event.text.split(/\s+/)[1], resolver: 'attr', attr: 'href'}.

For selector we retrieve the second input from the Slack event using event.text.split(/\s+/)[1].
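Putting those pieces together, the two tokens of the command map onto the crawler request like this (a sketch of the payload construction only; the buildCrawlerParams helper name is mine, and the actual API call appears in the full code above):

```javascript
// Build the crawler query payload from a raw Slack command string.
// Field names mirror the parameters in the code above.
const buildCrawlerParams = (commandText) => {
  const [url, selector] = commandText.split(/\s+/);
  return {
    url: url,                          // first token: the page to scrape
    userAgent: 'stdlib/crawler/query', // the default user agent
    includeMetadata: false,            // true would add a `meta` field to the response
    selectorQueries: [
      // pull the href attribute off every element matching the selector
      {selector: selector, resolver: 'attr', attr: 'href'}
    ]
  };
};

const params = buildCrawlerParams('https://techcrunch.com/ a.post-block__title__link');
// params.url → 'https://techcrunch.com/'
// params.selectorQueries[0].selector → 'a.post-block__title__link'
```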

Lines 40–53 create and post your message using the parameters passed in: channel and text.
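One detail worth calling out in that range is how relative links are handled: a scraped href that already starts with a scheme (or a protocol-relative //) is kept as-is, while anything else is prefixed with the scraped page's URL. A standalone sketch of that branch (the resolveLink name is mine):

```javascript
// Mirrors the link-normalization logic in the webhook: absolute and
// protocol-relative hrefs pass through; relative hrefs are prefixed
// with the URL of the page that was scraped.
const resolveLink = (pageUrl, href) => {
  if (href.startsWith('http://') || href.startsWith('https://') || href.startsWith('//')) {
    return href;
  }
  return pageUrl + href;
};

console.log(resolveLink('https://example.com/', 'https://example.com/post')); // unchanged
console.log(resolveLink('https://example.com/', 'post/123')); // 'https://example.com/post/123'
```

Note this is simple string concatenation, not full URL resolution: a relative href like ../about would not be normalized the way a browser would.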

You can read more about API specifications and parameters here: https://docs.stdlib.com/connector-apis/building-an-api/api-specification/

Making Changes

Now that your app is live, you can return at any time to add additional logic and scrape websites for data with crawler.api.

There are two ways to modify your application. The first is via our in-browser editor, Autocode. The second is via the Standard Library CLI.

via Web Browser

Simply visit Autocode.com and select your project. You can make updates, save your changes, and deploy directly from your browser.

Shipping to Production

Standard Library makes dev/prod environment management easy. If you’d like to ship to production, visit build.stdlib.com, find your project, and select Manage.

From the environment management screen, simply click Ship Release.

Link any necessary resources, specify the version of the release and click Create Release to proceed.

That’s all you need to do!

Support

Via Slack: libdev.slack.com

You can request an invitation by clicking Community > Slack in the top bar on https://stdlib.com.

Via Twitter: @StandardLibrary

Via E-mail: support@stdlib.com

