Project Description
Data Extracting SDK is a framework for extracting information from text and web sources in a simple way. You can use it to analyze text content and to build your own data mining and extraction applications and services.
Licensing
Data Extracting SDK Open Source GPL (Open Source License)
This is a suitable option if you are building an application for internal use or an open source application with a license compatible with the GNU GPL v2.0.
Data Extracting SDK Commercial License (Developer License with Subscription and Priority Support)
This is a suitable option if you are building closed-source commercial products for redistribution or if you wish to avoid integrating open-source components into your application.
Features
- DOM analysis
- extraction of emails, phone numbers, images, and links
- extraction of website meta information
- rich data parsing capabilities
- website screenshot capture
- rich HTML processing
- and more
Simple Example
This sample extracts all emails from the given web page:
using System;
using System.Data.Extracting;

namespace VS2010Demo
{
    class Program
    {
        static void Main(string[] args)
        {
            // Create an extractor that looks for email addresses on the given page.
            DataExtractor ext = new DataExtractor(new Uri("http://msug.vn.ua/"), DataTypes.Email);
            var results = ext.GetExtractedResults();

            // Print each extracted item together with the group it belongs to.
            foreach (var item in results)
            {
                Console.WriteLine("{0}: {1}", item.GroupName, item.Value);
            }

            Console.Write("\nPress any key to exit...");
            Console.ReadKey();
        }
    }
}
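For comparison, the sketch below shows roughly what email extraction from plain text involves when written by hand with only the standard .NET regular expression classes. It does not use the SDK and is not the SDK's implementation: the EmailSketch namespace, the EmailFinder class, the FindEmails helper, and the deliberately simple pattern are all assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

namespace EmailSketch
{
    public static class EmailFinder
    {
        // Simplified pattern (an assumption): it catches common addresses,
        // not every address that the email standards allow.
        private static readonly Regex EmailPattern =
            new Regex(@"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}");

        // Hypothetical helper: returns the distinct addresses found in the text,
        // in order of first appearance, ignoring differences in letter case.
        public static List<string> FindEmails(string text)
        {
            var seen = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
            var found = new List<string>();
            foreach (Match m in EmailPattern.Matches(text))
            {
                if (seen.Add(m.Value))
                {
                    found.Add(m.Value);
                }
            }
            return found;
        }
    }

    public static class Program
    {
        public static void Main()
        {
            string sample = "Contact admin@example.com or sales@example.org; Admin@Example.com is a duplicate.";
            foreach (string email in EmailFinder.FindEmails(sample))
            {
                Console.WriteLine(email);
            }
        }
    }
}
```

A hand-written pattern like this only handles text that is already in memory; downloading the page, walking the DOM, and grouping results are the parts the SDK example above delegates to DataExtractor.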