Documentation

Complete guide to using the DDG Web Search package.

Getting Started

The DDG Web Search package searches the web through DuckDuckGo and fetches web content, with built-in rate limiting and error handling. It can be used as a library, from the command line, or as an MCP server.

Installation

npm install @lucid-spark/ddg-web-search

Quick Start

    import { WebSearcher, WebContentFetcher } from "@lucid-spark/ddg-web-search";

    const searcher = new WebSearcher();
    const fetcher = new WebContentFetcher();

    // Search the web
    const results = await searcher.search("TypeScript");

    // Fetch web content
    const content = await fetcher.fetch("https://example.com");

    // Release browser resources when done
    await searcher.close();

API Reference

Complete reference for all classes, methods, and interfaces provided by the package.

WebSearcher

The main class for searching the web using DuckDuckGo with browser automation. Features built-in rate limiting, captcha handling, and error handling.

Constructor

new WebSearcher(headless?: boolean)

Creates a new WebSearcher instance with browser automation capabilities.

  • headless (optional): Whether to run browser in headless mode (default: true)

Methods

search(query: string): Promise<SearchResult[]>

Searches the web using DuckDuckGo with browser automation and returns formatted results.

Parameters

  • query (string): The search query string

Returns

Promise resolving to an array of SearchResult objects.

Features

  • JavaScript rendering support for dynamic content
  • Captcha detection and handling
  • Anti-bot detection countermeasures
  • Conservative rate limiting (1 request per 2 seconds)

Example

    const searcher = new WebSearcher();
    const results = await searcher.search("Node.js tutorials");

    results.forEach(result => {
      console.log(result.title);
      console.log(result.url);
      console.log(result.snippet);
    });

    // Important: cleanup browser resources
    await searcher.close();

close(): Promise<void>

Closes the browser instance and frees system resources. Call it when you are done searching to prevent memory leaks.

Example

    try {
      const results = await searcher.search("query");
      // process results
    } finally {
      await searcher.close(); // Always cleanup
    }

WebContentFetcher

Class for fetching and parsing web content with configurable rate limiting.

Constructor

new WebContentFetcher(rateLimit?: number, rateLimitInterval?: number)

Parameters

  • rateLimit (optional): Number of requests allowed per interval (default: 1)
  • rateLimitInterval (optional): Time interval in milliseconds (default: 1000)

Methods

fetch(url: string): Promise<FetchResult>

Fetches content from the specified URL with rate limiting and error handling.

Parameters

  • url (string): The URL to fetch content from

Returns

Promise resolving to a FetchResult object.

Example

    // Default rate limiting (1 request per second)
    const fetcher = new WebContentFetcher();

    // Custom rate limiting (2 requests per 3 seconds)
    const customFetcher = new WebContentFetcher(2, 3000);

    const result = await fetcher.fetch("https://example.com");
    if (result.success) {
      console.log("Content:", result.data?.content);
      console.log("Title:", result.data?.metadata?.title);
      console.log("Description:", result.data?.metadata?.description);
    } else {
      console.error("Error:", result.error);
    }

CLI Interface

Command-line interface for easy access to search and fetch functionality.

Installation

npm install -g @lucid-spark/ddg-web-search

Commands

ddg-web-search search <query>

Search the web for the given query using browser automation.

ddg-web-search search "TypeScript tutorials"

ddg-web-search fetch <url>

Fetch content from the specified URL with intelligent parsing.

ddg-web-search fetch https://example.com

ddg-web-search interactive

Start interactive mode for multiple commands with enhanced user experience.

ddg-web-search interactive

ddg-web-search mcp

Start MCP server with stdio transport for AI assistant integration.

ddg-web-search mcp

ddg-web-search mcp-http

Start MCP server with HTTP transport for web-based AI assistant integration.

ddg-web-search mcp-http

Interactive Mode Commands

  • search <query> or s <query> - Search the web
  • fetch <url> or f <url> - Fetch web content
  • mcp - Start MCP server (stdio transport)
  • mcp-http - Start MCP server (HTTP transport)
  • help or h - Show help
  • version or v - Show version
  • clear or cls - Clear screen
  • exit or quit or q - Exit
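The alias table above can be sketched as a small lookup. This is an illustration only, not the package's own parser: the names `InteractiveCommand`, `COMMAND_ALIASES`, and `parseInteractiveInput` are hypothetical, and treating commands as case-insensitive is an assumption.

```typescript
// Hypothetical sketch: maps the documented aliases to canonical command names.
type InteractiveCommand =
  | "search" | "fetch" | "mcp" | "mcp-http"
  | "help" | "version" | "clear" | "exit";

const COMMAND_ALIASES = new Map<string, InteractiveCommand>([
  ["search", "search"], ["s", "search"],
  ["fetch", "fetch"], ["f", "fetch"],
  ["mcp", "mcp"],
  ["mcp-http", "mcp-http"],
  ["help", "help"], ["h", "help"],
  ["version", "version"], ["v", "version"],
  ["clear", "clear"], ["cls", "clear"],
  ["exit", "exit"], ["quit", "exit"], ["q", "exit"],
]);

interface ParsedInput {
  command: InteractiveCommand;
  argument: string; // query or URL; empty for commands that take none
}

// Returns null for blank lines and unknown commands.
function parseInteractiveInput(line: string): ParsedInput | null {
  const trimmed = line.trim();
  if (trimmed === "") return null;

  // Split on the first whitespace character: command word, then the rest.
  const firstSpace = trimmed.search(/\s/);
  const word = firstSpace === -1 ? trimmed : trimmed.slice(0, firstSpace);
  const argument = firstSpace === -1 ? "" : trimmed.slice(firstSpace).trim();

  // Assumption: command words are matched case-insensitively.
  const command = COMMAND_ALIASES.get(word.toLowerCase());
  return command === undefined ? null : { command, argument };
}
```

Under these assumptions, `s TypeScript tutorials` and `search TypeScript tutorials` parse to the same command and argument.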

MCP Server

Model Context Protocol server for AI assistant integration with support for both stdio and HTTP transports.

Starting the Server

Stdio Transport (Default)

    # Using global binary
    ddg-web-search-mcp

    # Using npx
    npx @lucid-spark/ddg-web-search mcp

    # Using built files
    node dist/mcp.js

HTTP Transport

    # Default host and port (localhost:3001)
    node dist/mcp.js --transport http

    # Custom host and port
    node dist/mcp.js --transport http --host 0.0.0.0 --port 8080

Configuration

Add to your MCP client configuration (e.g., Claude Desktop):

    {
      "mcpServers": {
        "ddg-web-search": {
          "command": "@lucid-spark/ddg-web-search-mcp",
          "args": [],
          "env": {}
        }
      }
    }

Available Tools

search

Search the web using DuckDuckGo with browser automation.

Input: { "query": "search terms" }

Output: Formatted list of search results with titles, URLs, and snippets
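The exact layout of the formatted list is not specified here. As a rough sketch of what such formatting could look like, the function below turns an array of results into numbered text. The `SearchResult` shape is copied from the TypeScript Types section; the function name `formatSearchResults` and the output layout are assumptions.

```typescript
// Shape taken from the TypeScript Types section of this guide.
interface SearchResult {
  title: string;
  url: string;
  snippet: string;
  icon?: string;
}

// Hypothetical formatter: one numbered entry per result, with the URL and
// snippet indented beneath the title. The real tool's layout may differ.
function formatSearchResults(results: SearchResult[]): string {
  if (results.length === 0) return "No results found.";
  return results
    .map((r, i) => `${i + 1}. ${r.title}\n   ${r.url}\n   ${r.snippet}`)
    .join("\n\n");
}
```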

Features: JavaScript rendering, captcha handling, anti-detection measures

fetch_web_content

Fetch and parse content from a web URL with intelligent scraping.

Input: { "url": "https://example.com" }

Output: Parsed content with metadata (truncated to 10,000 characters if needed)
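A minimal sketch of the 10,000-character cap follows. How the server marks truncated output is not documented here, so the `truncated` flag and the name `truncateContent` are assumptions. Length is counted in UTF-16 code units, as JavaScript strings do by default.

```typescript
const MAX_CONTENT_LENGTH = 10000; // limit stated for fetch_web_content output

// Hypothetical helper: returns the (possibly shortened) text plus a flag
// so callers can tell whether anything was cut off.
function truncateContent(
  content: string,
  maxLength: number = MAX_CONTENT_LENGTH,
): { text: string; truncated: boolean } {
  if (content.length <= maxLength) {
    return { text: content, truncated: false };
  }
  return { text: content.slice(0, maxLength), truncated: true };
}
```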

Features: HTML-to-Markdown conversion, metadata extraction, content cleaning

HTTP Endpoints

When using HTTP transport, the server provides:

  • GET / - Server information and available endpoints
  • GET /sse - Server-Sent Events connection for receiving responses
  • POST /message/{sessionId} - Send MCP requests to the server
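For client code, the three endpoints can be derived from a host and port. The sketch below assumes plain `http://` and the default `localhost:3001` mentioned under HTTP Transport; `mcpHttpEndpoints` is a hypothetical helper, not part of the package.

```typescript
interface McpHttpEndpoints {
  info: string; // GET /
  sse: string; // GET /sse
  message: (sessionId: string) => string; // POST /message/{sessionId}
}

// Hypothetical helper. Assumes an unencrypted http:// server.
function mcpHttpEndpoints(host: string = "localhost", port: number = 3001): McpHttpEndpoints {
  const base = `http://${host}:${port}`;
  return {
    info: `${base}/`,
    sse: `${base}/sse`,
    // Encode the session ID so unusual characters cannot break the path.
    message: (sessionId: string) => `${base}/message/${encodeURIComponent(sessionId)}`,
  };
}
```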

Browser Automation

The WebSearcher uses Puppeteer for browser automation, which gives it better captcha handling and JavaScript rendering support than plain HTTP requests.

Key Features

  • JavaScript Rendering: Full support for dynamically loaded content
  • Captcha Handling: Intelligent detection and handling of captcha challenges
  • Anti-Detection: Uses real browser behavior to avoid bot detection
  • Resource Management: Proper cleanup of browser instances to prevent memory leaks

Browser Configuration

    // Headless mode (default) - no browser window
    const searcher = new WebSearcher();

    // Non-headless mode for debugging or manual captcha solving
    const debugSearcher = new WebSearcher(false);

Resource Management

Always call close() to prevent memory leaks:

    const searcher = new WebSearcher();

    try {
      const results = await searcher.search("query");
      // Process results
    } finally {
      await searcher.close(); // Essential for cleanup
    }
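To avoid repeating the try/finally pattern at every call site, it can be wrapped in a small helper. This is a sketch, not part of the package: `Closable` and `withClosable` are hypothetical names, and the only assumption about the resource is that it exposes `close(): Promise<void>`, as WebSearcher does.

```typescript
// Anything with an async close() method, such as a WebSearcher instance.
interface Closable {
  close(): Promise<void>;
}

// Runs `work` with the resource and always closes it afterwards,
// whether `work` resolves or throws.
async function withClosable<R extends Closable, T>(
  resource: R,
  work: (resource: R) => Promise<T>,
): Promise<T> {
  try {
    return await work(resource);
  } finally {
    await resource.close();
  }
}

// Possible usage with the package (not executed here):
//   const results = await withClosable(new WebSearcher(), (s) => s.search("query"));
```

One design note: if both `work` and `close()` throw, the error from `close()` replaces the original one, which is standard `finally` behavior.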

Captcha Handling

  • Headless Mode: Returns empty results if captcha is detected
  • Non-Headless Mode: Allows time for manual captcha solving
  • Detection: Automatically detects captcha challenges on the page
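Since headless mode returns an empty array when a captcha is detected, one possible strategy is to retry in a visible browser when the headless attempt comes back empty. The sketch below is a rough illustration under stated assumptions: `SearcherLike`, `SearchHit`, and `searchWithVisibleFallback` are hypothetical names, and an empty result can also mean there were genuinely no matches, so the retry may sometimes be unnecessary.

```typescript
interface SearchHit {
  title: string;
  url: string;
  snippet: string;
}

// Minimal view of a searcher: just the two methods this sketch needs.
interface SearcherLike {
  search(query: string): Promise<SearchHit[]>;
  close(): Promise<void>;
}

// Tries headless first, then a visible browser. Each searcher is closed
// before the next attempt so that no browser instance is leaked.
async function searchWithVisibleFallback(
  createSearcher: (headless: boolean) => SearcherLike,
  query: string,
): Promise<SearchHit[]> {
  for (const headless of [true, false]) {
    const searcher = createSearcher(headless);
    try {
      const results = await searcher.search(query);
      if (results.length > 0) return results;
    } finally {
      await searcher.close();
    }
  }
  return [];
}

// Possible usage with the package (not executed here):
//   const results = await searchWithVisibleFallback((h) => new WebSearcher(h), "query");
```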

Performance Considerations

  • Browser Overhead: Higher resource usage than HTTP requests
  • Initialization: Browser launch takes 1-2 seconds
  • Memory Usage: Monitor for memory leaks in long-running applications
  • Browser Reuse: Browser instance is reused across searches for efficiency

Utilities

Utility classes used internally and available for advanced usage.

RateLimiter

Token bucket rate limiter for controlling request frequency.

    import { RateLimiter } from "@lucid-spark/ddg-web-search";

    // 5 requests per 10 seconds
    const rateLimiter = new RateLimiter(5, 10000);

    // Wait for permission to make a request
    await rateLimiter.acquire();

    // Get current status
    const status = rateLimiter.getStatus();
    console.log(status.requests, status.limit, status.interval);
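For readers unfamiliar with the token bucket idea, here is a minimal sketch of the technique. It is not the package's implementation: the class `TokenBucketSketch` and its methods `tryAcquire` and `msUntilNextToken` are hypothetical, and the injectable clock exists only to make the behavior easy to test.

```typescript
// Illustrative token bucket: holds up to `limit` tokens and refills
// continuously at a rate of `limit` tokens per `intervalMs`.
class TokenBucketSketch {
  private readonly limit: number;
  private readonly intervalMs: number;
  private readonly now: () => number;
  private tokens: number;
  private lastRefill: number;

  constructor(limit: number, intervalMs: number, now: () => number = () => Date.now()) {
    this.limit = limit;
    this.intervalMs = intervalMs;
    this.now = now;
    this.tokens = limit; // start with a full bucket
    this.lastRefill = now();
  }

  // Adds the tokens earned since the last refill, capped at `limit`.
  private refill(): void {
    const current = this.now();
    const elapsed = current - this.lastRefill;
    this.tokens = Math.min(this.limit, this.tokens + (elapsed * this.limit) / this.intervalMs);
    this.lastRefill = current;
  }

  // Takes one token if available; never waits.
  tryAcquire(): boolean {
    this.refill();
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }

  // Milliseconds until at least one whole token is available.
  msUntilNextToken(): number {
    this.refill();
    if (this.tokens >= 1) return 0;
    return Math.ceil(((1 - this.tokens) * this.intervalMs) / this.limit);
  }
}
```

A waiting `acquire()` like the one the package exposes could be layered on top by sleeping for `msUntilNextToken()` and retrying.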

HttpClient

Singleton HTTP client with centralized error handling and logging.

    import { HttpClient } from "@lucid-spark/ddg-web-search";

    const client = HttpClient.getInstance();

    // GET request
    const data = await client.get("https://api.example.com");

    // POST request
    const result = await client.post("https://api.example.com", { key: "value" });

TypeScript Types

Complete type definitions for all interfaces and types used in the package.

SearchResult

    interface SearchResult {
      title: string;   // The title of the search result
      url: string;     // The URL of the search result
      snippet: string; // A brief description or snippet
      icon?: string;   // Optional icon URL for the result
    }

WebContent

    interface WebContent {
      content: string; // Main content of the page (HTML converted to Markdown)
      metadata?: {
        // Optional metadata extracted from the page
        title?: string;       // Page title
        description?: string; // Page description
        url?: string;         // Page URL
        author?: string;      // Page author
        publishDate?: string; // Publication date
      };
    }

FetchResult

    interface FetchResult {
      success: boolean;   // Whether the fetch was successful
      data?: WebContent;  // Parsed web content (if available)
      error?: string;     // Error message (if failed)
    }
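Because `data` and `error` are both optional, callers need to check `success` and still guard against missing fields. The sketch below shows one way to do that. The interfaces are copied from this section so the example is self-contained; `summarizeFetchResult` and its fallback strings are hypothetical.

```typescript
// Copied from the type definitions in this section.
interface WebContent {
  content: string;
  metadata?: {
    title?: string;
    description?: string;
    url?: string;
    author?: string;
    publishDate?: string;
  };
}

interface FetchResult {
  success: boolean;
  data?: WebContent;
  error?: string;
}

// Hypothetical helper: produces a one-line summary for logging.
function summarizeFetchResult(result: FetchResult): string {
  if (!result.success) {
    return `Fetch failed: ${result.error ?? "unknown error"}`;
  }
  if (!result.data) {
    return "Fetch succeeded but returned no content";
  }
  const title = result.data.metadata?.title ?? "(untitled)";
  return `${title} (${result.data.content.length} characters)`;
}
```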

FetchOptions

    interface FetchOptions {
      timeout?: number;                 // Request timeout in milliseconds
      headers?: Record<string, string>; // Custom headers
    }

Error Handling

The package provides comprehensive error handling for various failure scenarios.

Search Errors

  • Empty or invalid queries return empty results arrays
  • Network errors are caught and logged, returning empty arrays
  • Invalid API responses are handled gracefully

Fetch Errors

  • Invalid URLs return error results with descriptive messages
  • Network timeouts and connection errors are caught
  • HTTP error status codes are handled appropriately
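Although the fetcher reports invalid URLs itself, checking a URL up front can avoid spending a rate-limited request on input that cannot succeed. The sketch below uses the standard `URL` constructor. `validateHttpUrl` is a hypothetical helper, and restricting input to `http:` and `https:` is an assumption about which schemes are useful to fetch.

```typescript
type UrlCheck =
  | { ok: true; url: URL }
  | { ok: false; reason: string };

// Hypothetical pre-flight check: parses the input and accepts only
// http: and https: URLs.
function validateHttpUrl(input: string): UrlCheck {
  let parsed: URL;
  try {
    parsed = new URL(input); // throws a TypeError on malformed input
  } catch {
    return { ok: false, reason: `Not a valid URL: ${input}` };
  }
  if (parsed.protocol !== "http:" && parsed.protocol !== "https:") {
    return { ok: false, reason: `Unsupported protocol: ${parsed.protocol}` };
  }
  return { ok: true, url: parsed };
}
```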

Example Error Handling

    try {
      const results = await searcher.search("query");
      if (results.length === 0) {
        console.log("No results found");
      }

      const fetchResult = await fetcher.fetch("https://example.com");
      if (!fetchResult.success) {
        console.error("Fetch failed:", fetchResult.error);
      }
    } catch (error) {
      console.error("Unexpected error:", error);
    }

Rate Limiting

Built-in rate limiting prevents overwhelming external services and ensures respectful API usage.

Default Rate Limits

  • WebSearcher: 1 request per 2 seconds (conservative for browser automation)
  • WebContentFetcher: 1 request per second (configurable)

Custom Rate Limiting

    // 3 requests per 5 seconds
    const fetcher = new WebContentFetcher(3, 5000);

    // Rate limiting is enforced automatically
    for (let i = 0; i < 5; i++) {
      const result = await fetcher.fetch(`https://httpbin.org/get?id=${i}`);
      console.log(`Request ${i + 1} completed`);
    }

Rate Limiter API

    const rateLimiter = new RateLimiter(2, 1000); // 2 per second

    // Check status
    const status = rateLimiter.getStatus();
    console.log(`${status.requests}/${status.limit} requests used`);

    // Acquire permission (waits if necessary)
    await rateLimiter.acquire();
    console.log("Permission granted for request");

Testing

The package includes comprehensive test suites and supports easy mocking for unit tests.

Running Tests

    # Run all tests
    npm test

    # Run tests with coverage
    npm run test -- --coverage

    # Run specific test file
    npm test -- WebSearcher.test.ts

Mocking for Tests

    import { WebSearcher } from '@lucid-spark/ddg-web-search';
    import axios from 'axios';

    jest.mock('axios');
    const mockedAxios = axios as jest.Mocked<typeof axios>;

    describe('WebSearcher', () => {
      it('should return search results', async () => {
        mockedAxios.get.mockResolvedValue({
          data: { RelatedTopics: [/* mock data */] }
        });

        const searcher = new WebSearcher();
        const results = await searcher.search('test');

        expect(results).toHaveLength(1);
      });
    });

Migration Guide

Guide for migrating from older versions or similar packages.

Breaking Changes

This section will be updated as new versions are released with any breaking changes and migration instructions.

Upgrading

    # Update to latest version
    npm update @lucid-spark/ddg-web-search

    # Check for security updates
    npm audit

    # Update global installation
    npm install -g @lucid-spark/ddg-web-search@latest