Web Scraper API - Examples

📋 API Endpoints

GET /scraper/

Returns basic information about the API

POST /scraper/scrape

Scrapes a website with Beautiful Soup or Selenium

POST /scraper/scrape-and-embed

Scrapes a website and generates embeddings

POST /scraper/scrape-and-analyze

Scrapes a website and analyzes it with Ollama AI
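The four endpoints can be collected into a small request builder. The following is a minimal sketch using only the Python standard library; the base URL is the host used in the curl examples in this document, while the short endpoint keys (`info`, `scrape`, `embed`, `analyze`) and the `build_request` helper are hypothetical names introduced for illustration.

```python
import json
import urllib.request

# Host taken from the curl examples in this document.
BASE_URL = "https://ai.clearcrowd.net"

# HTTP method and path per endpoint; the dictionary keys are illustrative only.
ENDPOINTS = {
    "info": ("GET", "/scraper/"),
    "scrape": ("POST", "/scraper/scrape"),
    "embed": ("POST", "/scraper/scrape-and-embed"),
    "analyze": ("POST", "/scraper/scrape-and-analyze"),
}


def build_request(name, payload=None):
    """Build (but do not send) a urllib Request for one of the endpoints."""
    http_method, path = ENDPOINTS[name]
    body = None
    headers = {}
    if http_method == "POST":
        # POST endpoints take a JSON body, as in the curl examples.
        body = json.dumps(payload or {}).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        BASE_URL + path, data=body, headers=headers, method=http_method
    )


req = build_request("scrape", {"url": "https://example.com", "method": "beautifulsoup"})
print(req.get_method(), req.full_url)
# To actually send it: urllib.request.urlopen(req, timeout=30)
```

Building the request separately from sending it keeps the endpoint table easy to inspect and test without network access.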

🔧 Basic Usage

1. Scraping with Beautiful Soup

curl -X POST "https://ai.clearcrowd.net/scraper/scrape" \
-H "Content-Type: application/json" \
-d '{
  "url": "https://example.com",
  "method": "beautifulsoup",
  "selectors": {
    "title": "h1",
    "description": "meta[name=description]",
    "links": "a"
  }
}'
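To make the three selectors in this request concrete, here is a rough local illustration of what they target, written with the standard-library `html.parser` rather than Beautiful Soup. It is not the service's implementation: the `FieldExtractor` class is hypothetical, and returning the `href` of each link (rather than its text) is an assumption.

```python
from html.parser import HTMLParser


class FieldExtractor(HTMLParser):
    """Collects the first <h1> text, the meta description, and link hrefs."""

    def __init__(self):
        super().__init__()
        self.fields = {"title": None, "description": None, "links": []}
        self._in_h1 = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "h1" and self.fields["title"] is None:
            # Selector "h1": start collecting text for the first heading only.
            self._in_h1 = True
            self.fields["title"] = ""
        elif tag == "meta" and attrs.get("name") == "description":
            # Selector "meta[name=description]": the value lives in `content`.
            self.fields["description"] = attrs.get("content")
        elif tag == "a" and "href" in attrs:
            # Selector "a": keep the target of every link that has one.
            self.fields["links"].append(attrs["href"])

    def handle_endtag(self, tag):
        if tag == "h1":
            self._in_h1 = False

    def handle_data(self, data):
        if self._in_h1:
            self.fields["title"] += data


sample = """<html><head><meta name="description" content="Demo page"></head>
<body><h1>Hello</h1><a href="/a">A</a><a href="/b">B</a></body></html>"""

parser = FieldExtractor()
parser.feed(sample)
print(parser.fields)
```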
            

2. Scraping with Selenium (for JavaScript-driven websites)

curl -X POST "https://ai.clearcrowd.net/scraper/scrape" \
-H "Content-Type: application/json" \
-d '{
  "url": "https://spa-website.com",
  "method": "selenium",
  "wait_time": 10,
  "selectors": {
    "dynamic_content": ".dynamic-element",
    "loaded_data": "[data-loaded=true]"
  }
}'
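The two basic requests differ only in `method` and in the Selenium-specific `wait_time`. A minimal sketch of a payload builder that captures this difference follows; `build_payload` and its `needs_javascript` flag are hypothetical names, and the default wait of 10 simply mirrors the example above.

```python
def build_payload(url, selectors=None, needs_javascript=False, wait_time=10, headers=None):
    """Return a request body for POST /scraper/scrape as a plain dict."""
    payload = {
        "url": url,
        # Selenium for JavaScript-rendered pages, Beautiful Soup otherwise.
        "method": "selenium" if needs_javascript else "beautifulsoup",
    }
    if selectors:
        payload["selectors"] = dict(selectors)
    if needs_javascript:
        # wait_time is documented as applying to Selenium, so it is only sent then.
        payload["wait_time"] = wait_time
    if headers:
        payload["headers"] = dict(headers)
    return payload


static_page = build_payload("https://example.com", {"title": "h1"})
spa_page = build_payload(
    "https://spa-website.com",
    {"dynamic_content": ".dynamic-element"},
    needs_javascript=True,
)
print(static_page)
print(spa_page)
```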
            

🤖 AI Integration

1. Scraping + Embeddings

curl -X POST "https://ai.clearcrowd.net/scraper/scrape-and-embed" \
-H "Content-Type: application/json" \
-d '{
  "url": "https://news-website.com/article",
  "method": "beautifulsoup",
  "selectors": {
    "headline": "h1",
    "content": ".article-body",
    "author": ".author-name"
  }
}'
            

2. Scraping + AI Analysis

curl -X POST "https://ai.clearcrowd.net/scraper/scrape-and-analyze" \
-H "Content-Type: application/json" \
-d '{
  "url": "https://blog.example.com/post",
  "method": "beautifulsoup",
  "selectors": {
    "title": "h1",
    "content": ".post-content",
    "tags": ".tag"
  }
}'
            

🛠️ Supported Tools

🍲 Beautiful Soup

Fast and lightweight
Suited to static websites
Supports CSS selectors
Low resource usage

🎭 Selenium

Supports JavaScript
Suited to single-page applications (SPAs)
Supports XPath
Runs in a headless browser
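Since XPath is listed only under Selenium, a client may want to catch XPath expressions sent with the `beautifulsoup` method before making a request. The sketch below is a heuristic under stated assumptions: how the API itself reacts to such a request is not documented here, and `looks_like_xpath` and `check_selectors` are hypothetical helpers.

```python
def looks_like_xpath(selector):
    """Heuristic: XPath expressions usually begin with '/', './', '../' or '('."""
    return selector.strip().startswith(("/", "./", "../", "("))


def check_selectors(method, selectors):
    """Return the field names whose selector looks like XPath but is used
    with the 'beautifulsoup' method."""
    if method != "beautifulsoup":
        return []
    return [name for name, sel in selectors.items() if looks_like_xpath(sel)]


suspect = check_selectors(
    "beautifulsoup",
    {"title": "h1", "price": "//span[@class='price']"},
)
print(suspect)
```

A CSS class selector such as `.price` starts with a dot but not with `./`, so it is not flagged.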

📝 Parameters

Request Body

{
  "url": "string (required)",           // URL to scrape
  "method": "beautifulsoup|selenium",   // Scraping method
  "selectors": {                        // CSS selectors or XPath
    "field_name": "selector"
  },
  "wait_time": 5,                       // Wait time (Selenium only)
  "headers": {                          // Custom headers
    "User-Agent": "custom-agent"
  }
}
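A client-side check of the request body can catch mistakes before a request is sent. This is a minimal sketch based only on the shape shown above: only `url` is marked as required, so treating every other field as optional is an assumption, and `validate_request_body` is a hypothetical helper.

```python
ALLOWED_METHODS = {"beautifulsoup", "selenium"}


def _is_str_map(value):
    """True if value is a dict whose keys and values are all strings."""
    return isinstance(value, dict) and all(
        isinstance(k, str) and isinstance(v, str) for k, v in value.items()
    )


def validate_request_body(body):
    """Return a list of problems; an empty list means the body looks valid."""
    problems = []
    url = body.get("url")
    if not isinstance(url, str) or not url:
        problems.append("url is required and must be a non-empty string")
    method = body.get("method")
    if method is not None and method not in ALLOWED_METHODS:
        problems.append("method must be 'beautifulsoup' or 'selenium'")
    if "selectors" in body and not _is_str_map(body["selectors"]):
        problems.append("selectors must map field names to selector strings")
    wait_time = body.get("wait_time")
    if wait_time is not None:
        # bool is a subclass of int in Python, so it is rejected explicitly.
        is_number = isinstance(wait_time, (int, float)) and not isinstance(wait_time, bool)
        if not is_number or wait_time < 0:
            problems.append("wait_time must be a non-negative number")
    if "headers" in body and not _is_str_map(body["headers"]):
        problems.append("headers must map header names to string values")
    return problems


print(validate_request_body({"url": "https://example.com", "method": "selenium", "wait_time": 5}))
print(validate_request_body({"method": "playwright"}))
```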
            

Response

{
  "success": true,
  "url": "scraped_url",
  "title": "page_title",
  "content": "full_page_content",       // Returned when no selectors are given
  "data": {                            // Values extracted with the selectors
    "field_name": "extracted_value"
  },
  "metadata": {
    "method": "beautifulsoup",
    "status_code": 200,
    "content_type": "text/html"
  },
  "timestamp": "2025-09-28T12:00:00"
}
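A response of this shape can be consumed with a small helper that prefers the selector results in `data` and falls back to `content` when no selectors were sent. This is a sketch under assumptions: the behavior on `"success": false` is not documented here, so raising `ValueError` is an illustrative choice, and `extract_fields` and `scraped_at` are hypothetical helpers.

```python
from datetime import datetime


def extract_fields(response):
    """Return the extracted fields, or the full page content as a fallback."""
    if not response.get("success"):
        raise ValueError("scrape was not successful")
    data = response.get("data")
    if data:
        return data
    # No selectors were given, so only the full page content is available.
    return {"content": response.get("content", "")}


def scraped_at(response):
    """Parse the ISO 8601 timestamp shown in the response example."""
    return datetime.fromisoformat(response["timestamp"])


sample = {
    "success": True,
    "url": "scraped_url",
    "title": "page_title",
    "data": {"field_name": "extracted_value"},
    "metadata": {"method": "beautifulsoup", "status_code": 200, "content_type": "text/html"},
    "timestamp": "2025-09-28T12:00:00",
}

print(extract_fields(sample))
print(scraped_at(sample).date())
```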
            

💡 Usage Examples

Python

import requests

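# Scrape a news article and analyze it
response = requests.post(
    'https://ai.clearcrowd.net/scraper/scrape-and-analyze',
    json={
        'url': 'https://news-website.com/article',
        'method': 'beautifulsoup',
        'selectors': {
            'headline': 'h1',
            'content': '.article-body',
            'date': '.publish-date'
        }
    },
    timeout=60,  # illustrative value; without a timeout, requests can wait indefinitely
)
response.raise_for_status()  # stop here on HTTP 4xx/5xx instead of parsing an error body

result = response.json()
print(f"Title: {result['scrape_result']['data']['headline']}")
print(f"Analysis: {result['analysis']}")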
            

JavaScript

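// Scrape an e-commerce product page
// (top-level await requires an ES module or an enclosing async function)
const response = await fetch('https://ai.clearcrowd.net/scraper/scrape', {
    method: 'POST',
    headers: {'Content-Type': 'application/json'},
    body: JSON.stringify({
        url: 'https://shop.example.com/product',
        method: 'selenium',
        selectors: {
            price: '.price',
            title: 'h1',
            rating: '.rating-stars'
        }
    })
});

// fetch does not reject on HTTP error statuses, so check explicitly
if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
}

const data = await response.json();
console.log('Product:', data.data);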
            
