
How to create a web crawler in ASP.NET C#


In this article, I will show you how to create a website URL crawler using ASP.NET and C#. You enter a URL, and the application requests that web page and reads the response data programmatically, so you can crawl web pages and extract data from a website.

The following code, written in C#, crawls a web page and retrieves its data from the website.

HTML Design Code:

Create an ASP.NET web application, right-click the project, add a new web form, and name it CrawlData.aspx. Copy and paste the following design code into it.

<form id="form1" runat="server" style="text-align: center">
    <div style="border: 1px solid #DED8D8; width: 750px; height: 550px; font-family: Arial;">
        <h2>Crawl website URL</h2>
        <asp:TextBox ID="txtUrl" runat="server"></asp:TextBox>
        <asp:Button ID="btnCrawl" Text="Crawl" runat="server" OnClick="btnCrawl_Click" Style="height: 26px" />
        <br />
        <br />
        <iframe style="width: 750px; height: 100%;" id="irm1" src="CrawlData/new.html" runat="server"></iframe>
    </div>
</form>

Code behind:

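// Needs these directives in CrawlData.aspx.cs: using System; using System.IO; using System.Net;
protected void btnCrawl_Click(object sender, EventArgs e)
{
    string url = txtUrl.Text;
    WebRequest request = WebRequest.Create(url);
    string path = Server.MapPath("~/CrawlData/");
    // Request the page and read the whole response body as text.
    using (WebResponse response = request.GetResponse())
    using (StreamReader responseReader = new StreamReader(response.GetResponseStream()))
    {
        string responseData = responseReader.ReadToEnd();
        // Save the downloaded HTML to ~/CrawlData/new.html (the CrawlData folder must already exist).
        using (StreamWriter writer = new StreamWriter(path + "new.html"))
        {
            writer.Write(responseData);
        }
    }
    // Point the iframe at the saved copy of the page.
    irm1.Src = "CrawlData/new.html";
}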
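
The handler above passes txtUrl.Text directly to WebRequest.Create, which throws an exception when the text is not a valid absolute URI. One possible way to guard against that is to check the input first. The sketch below is only an illustration: the class name UrlInput and the method TryGetHttpUrl are hypothetical and not part of the original code, and it assumes that only http and https URLs should be accepted.

```csharp
using System;

public static class UrlInput
{
    // Hypothetical helper: returns true and the parsed Uri only for
    // absolute http or https URLs; returns false for anything else.
    public static bool TryGetHttpUrl(string input, out Uri uri)
    {
        uri = null;
        if (string.IsNullOrWhiteSpace(input))
            return false;

        Uri parsed;
        if (!Uri.TryCreate(input.Trim(), UriKind.Absolute, out parsed))
            return false;

        // Assumption: the crawler should only fetch web pages, so other
        // schemes (ftp, file, mailto, ...) are rejected.
        if (parsed.Scheme != Uri.UriSchemeHttp && parsed.Scheme != Uri.UriSchemeHttps)
            return false;

        uri = parsed;
        return true;
    }
}
```

In btnCrawl_Click, this could be called before WebRequest.Create, returning early (and perhaps showing a message) when it returns false.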

Description: Run the application and enter the URL of the page you want to crawl. Click the “Crawl” button. The application requests the web page, retrieves its data, and saves it in the project folder “CrawlData”. Here I entered the URL "https://www.google.co.in"; the application crawled the web page and displayed it in the iframe.
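
Once the page has been downloaded, extracting data from it means picking values out of the HTML text. As a rough sketch of that step, the code below collects the link targets found in an HTML string. The names LinkExtractor and ExtractLinks are hypothetical and do not appear in the original code. It uses a regular expression, which only approximates HTML parsing; a dedicated HTML parser would likely be more robust on real pages.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class LinkExtractor
{
    // Approximate match for href="..." or href='...' inside an <a ...> tag.
    private static readonly Regex HrefPattern = new Regex(
        "<a\\s[^>]*?href\\s*=\\s*[\"']([^\"']*)[\"']",
        RegexOptions.IgnoreCase | RegexOptions.Compiled);

    // Returns the distinct http/https links in the HTML, resolved
    // against baseUri so that relative links become absolute.
    public static List<string> ExtractLinks(string html, Uri baseUri)
    {
        var links = new List<string>();
        foreach (Match match in HrefPattern.Matches(html ?? string.Empty))
        {
            Uri absolute;
            if (Uri.TryCreate(baseUri, match.Groups[1].Value, out absolute)
                && (absolute.Scheme == Uri.UriSchemeHttp || absolute.Scheme == Uri.UriSchemeHttps)
                && !links.Contains(absolute.AbsoluteUri))
            {
                links.Add(absolute.AbsoluteUri);
            }
        }
        return links;
    }
}
```

In the code-behind, this could be called with the responseData string and new Uri(url) after the page has been read.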

