Documentation
Complete guide to using the DDG Web Search package.
Getting Started
The DDG Web Search package searches the web through DuckDuckGo and fetches web page content, with built-in rate limiting and error handling. It can be used as a library, from the command line, or as an MCP server.
Installation
npm install @lucid-spark/ddg-web-search
Quick Start
import { WebSearcher, WebContentFetcher } from "@lucid-spark/ddg-web-search";
const searcher = new WebSearcher();
const fetcher = new WebContentFetcher();
// Search the web
const results = await searcher.search("TypeScript");
// Fetch web content
const content = await fetcher.fetch("https://example.com");
API Reference
Complete reference for all classes, methods, and interfaces provided by the package.
WebSearcher
The main class for searching the web using DuckDuckGo with browser automation. Features built-in rate limiting, captcha handling, and error handling.
Constructor
new WebSearcher(headless?: boolean)
Creates a new WebSearcher instance with browser automation capabilities.
- headless (optional): Whether to run browser in headless mode (default: true)
Methods
search(query: string): Promise<SearchResult[]>
Searches the web using DuckDuckGo with browser automation and returns formatted results.
Parameters
- query (string): The search query string
Returns
Promise resolving to an array of SearchResult objects.
Features
- JavaScript rendering support for dynamic content
- Captcha detection and handling
- Anti-bot detection countermeasures
- Conservative rate limiting (1 request per 2 seconds)
Example
const searcher = new WebSearcher();
const results = await searcher.search("Node.js tutorials");
results.forEach(result => {
  console.log(result.title);
  console.log(result.url);
  console.log(result.snippet);
});
// Important: cleanup browser resources
await searcher.close();
close(): Promise<void>
Closes the browser instance and frees its system resources. Call it once you are done searching to prevent memory leaks.
Example
const searcher = new WebSearcher();
try {
  const results = await searcher.search("query");
  // process results
} finally {
  await searcher.close(); // Always cleanup
}
WebContentFetcher
Class for fetching and parsing web content with configurable rate limiting.
Constructor
new WebContentFetcher(rateLimit?: number, rateLimitInterval?: number)
Parameters
- rateLimit (optional): Number of requests allowed per interval (default: 1)
- rateLimitInterval (optional): Time interval in milliseconds (default: 1000)
Methods
fetch(url: string): Promise<FetchResult>
Fetches content from the specified URL with rate limiting and error handling.
Parameters
- url (string): The URL to fetch content from
Returns
Promise resolving to a FetchResult object.
Example
// Default rate limiting (1 request per second)
const fetcher = new WebContentFetcher();
// Custom rate limiting (2 requests per 3 seconds)
const customFetcher = new WebContentFetcher(2, 3000);
const result = await fetcher.fetch("https://example.com");
if (result.success) {
  console.log("Content:", result.data?.content);
  console.log("Title:", result.data?.metadata?.title);
  console.log("Description:", result.data?.metadata?.description);
} else {
  console.error("Error:", result.error);
}
CLI Interface
Command-line interface for easy access to search and fetch functionality.
Installation
npm install -g @lucid-spark/ddg-web-search
Commands
ddg-web-search search <query>
Search the web for the given query using browser automation.
ddg-web-search search "TypeScript tutorials"
ddg-web-search fetch <url>
Fetch content from the specified URL with intelligent parsing.
ddg-web-search fetch https://example.com
ddg-web-search interactive
Start interactive mode for multiple commands with enhanced user experience.
ddg-web-search interactive
ddg-web-search mcp
Start MCP server with stdio transport for AI assistant integration.
ddg-web-search mcp
ddg-web-search mcp-http
Start MCP server with HTTP transport for web-based AI assistant integration.
ddg-web-search mcp-http
Interactive Mode Commands
- search <query> or s <query> - Search the web
- fetch <url> or f <url> - Fetch web content
- mcp - Start MCP server (stdio transport)
- mcp-http - Start MCP server (HTTP transport)
- help or h - Show help
- version or v - Show version
- clear or cls - Clear screen
- exit or quit or q - Exit
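The aliases above map several spellings onto one command. The sketch below shows one way such a lookup could work; `COMMAND_ALIASES` and `parseCommand` are hypothetical names for illustration and are not taken from the package's own parser.

```typescript
// Hypothetical alias table built from the interactive command list.
const COMMAND_ALIASES = new Map<string, string>([
  ["search", "search"], ["s", "search"],
  ["fetch", "fetch"], ["f", "fetch"],
  ["mcp", "mcp"],
  ["mcp-http", "mcp-http"],
  ["help", "help"], ["h", "help"],
  ["version", "version"], ["v", "version"],
  ["clear", "clear"], ["cls", "clear"],
  ["exit", "exit"], ["quit", "exit"], ["q", "exit"],
]);

interface ParsedCommand {
  command: string;  // canonical command name, or "unknown"
  argument: string; // everything after the first word, trimmed
}

// Split an input line into its first word (the command or alias)
// and the remainder (the query or URL).
function parseCommand(line: string): ParsedCommand {
  const trimmed = line.trim();
  const spaceIndex = trimmed.search(/\s/);
  const word = (spaceIndex === -1 ? trimmed : trimmed.slice(0, spaceIndex)).toLowerCase();
  const argument = spaceIndex === -1 ? "" : trimmed.slice(spaceIndex + 1).trim();
  return { command: COMMAND_ALIASES.get(word) ?? "unknown", argument };
}
```

Whether the real interactive mode is case-insensitive is not documented; lowercasing the first word is a choice made for this sketch.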
MCP Server
Model Context Protocol server for AI assistant integration with support for both stdio and HTTP transports.
Starting the Server
Stdio Transport (Default)
# Using global binary
ddg-web-search-mcp
# Using npx
npx @lucid-spark/ddg-web-search mcp
# Using built files
node dist/mcp.js
HTTP Transport
# Default host and port (localhost:3001)
node dist/mcp.js --transport http
# Custom host and port
node dist/mcp.js --transport http --host 0.0.0.0 --port 8080
Configuration
Add to your MCP client configuration (e.g., Claude Desktop):
{
  "mcpServers": {
    "ddg-web-search": {
      "command": "ddg-web-search-mcp",
      "args": [],
      "env": {}
    }
  }
}
Available Tools
search
Search the web using DuckDuckGo with browser automation.
Input: { "query": "search terms" }
Output: Formatted list of search results with titles, URLs, and snippets
Features: JavaScript rendering, captcha handling, anti-detection measures
fetch_web_content
Fetch and parse content from a web URL with intelligent scraping.
Input: { "url": "https://example.com" }
Output: Parsed content with metadata (truncated to 10,000 characters if needed)
Features: HTML-to-Markdown conversion, metadata extraction, content cleaning
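The 10,000-character cap mentioned above can be pictured with a small helper. This is only a sketch: `truncateContent` and the trailing marker text are assumptions, and the server's actual truncation logic may differ.

```typescript
const MAX_CONTENT_CHARS = 10000; // limit stated in the tool description
const TRUNCATION_MARKER = "\n\n[content truncated]"; // hypothetical marker

// Return the content unchanged when it fits, otherwise cut it at the limit
// and append a marker so the reader knows text is missing.
function truncateContent(content: string, maxChars: number = MAX_CONTENT_CHARS): string {
  if (content.length <= maxChars) {
    return content;
  }
  return content.slice(0, maxChars) + TRUNCATION_MARKER;
}
```

Note that `String.prototype.length` counts UTF-16 code units, so a cut can land inside a surrogate pair; a production version might cut on a code point or paragraph boundary instead.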
HTTP Endpoints
When using HTTP transport, the server provides:
- GET / - Server information and available endpoints
- GET /sse - Server-Sent Events connection for receiving responses
- POST /message/{sessionId} - Send MCP requests to the server
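MCP requests are JSON-RPC 2.0 messages, and a tool is invoked with the `tools/call` method. The sketch below builds such a request for the tools listed earlier and the URL it would be posted to. `buildToolCall` and `messageUrl` are hypothetical helpers, and the session id shown is a placeholder; how a client obtains a real one (presumably from the /sse stream) is an assumption.

```typescript
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Build a JSON-RPC 2.0 request that asks the server to run one tool.
function buildToolCall(id: number, name: string, args: Record<string, unknown>): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

// Build the POST /message/{sessionId} URL for a given server base URL.
function messageUrl(baseUrl: string, sessionId: string): string {
  return `${baseUrl.replace(/\/+$/, "")}/message/${encodeURIComponent(sessionId)}`;
}

// Placeholder session id; a real one comes from the server.
const exampleRequest = buildToolCall(1, "search", { query: "TypeScript tutorials" });
console.log(messageUrl("http://localhost:3001", "example-session"));
console.log(JSON.stringify(exampleRequest));
```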
Browser Automation
The WebSearcher uses Puppeteer for reliable browser automation, providing better captcha handling and JavaScript rendering support.
Key Features
- JavaScript Rendering: Full support for dynamically loaded content
- Captcha Handling: Intelligent detection and handling of captcha challenges
- Anti-Detection: Uses real browser behavior to avoid bot detection
- Resource Management: Proper cleanup of browser instances to prevent memory leaks
Browser Configuration
// Headless mode (default) - no browser window
const searcher = new WebSearcher();
// Non-headless mode for debugging or manual captcha solving
const debugSearcher = new WebSearcher(false);
Resource Management
Always call close() to prevent memory leaks:
const searcher = new WebSearcher();
try {
  const results = await searcher.search("query");
  // Process results
} finally {
  await searcher.close(); // Essential for cleanup
}
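The try/finally pattern can be wrapped once so callers cannot forget the cleanup step. The following is a minimal sketch, not part of the package: `withSearcher` and the `SearcherLike` interface are hypothetical, with `SearcherLike` mirroring only the `search` and `close` methods documented above.

```typescript
interface SearchResult {
  title: string;
  url: string;
  snippet: string;
  icon?: string;
}

// Hypothetical minimal shape of a searcher: just the two documented methods.
interface SearcherLike {
  search(query: string): Promise<SearchResult[]>;
  close(): Promise<void>;
}

// Create a searcher, run the given work with it, and always close it,
// whether the work resolves or throws.
async function withSearcher<T>(
  createSearcher: () => SearcherLike,
  work: (searcher: SearcherLike) => Promise<T>,
): Promise<T> {
  const searcher = createSearcher();
  try {
    return await work(searcher);
  } finally {
    await searcher.close();
  }
}
```

Assuming WebSearcher satisfies this shape, usage would look like `withSearcher(() => new WebSearcher(), (s) => s.search("query"))`.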
Captcha Handling
- Headless Mode: Returns empty results if captcha is detected
- Non-Headless Mode: Allows time for manual captcha solving
- Detection: Automatically detects captcha challenges on the page
Performance Considerations
- Browser Overhead: Higher resource usage than HTTP requests
- Initialization: Browser launch takes 1-2 seconds
- Memory Usage: Monitor for memory leaks in long-running applications
- Browser Reuse: Browser instance is reused across searches for efficiency
Utilities
Utility classes used internally and available for advanced usage.
RateLimiter
Token bucket rate limiter for controlling request frequency.
import { RateLimiter } from "@lucid-spark/ddg-web-search";
// 5 requests per 10 seconds
const rateLimiter = new RateLimiter(5, 10000);
// Wait for permission to make a request
await rateLimiter.acquire();
// Get current status
const status = rateLimiter.getStatus();
console.log(status.requests, status.limit, status.interval);
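For readers unfamiliar with the token bucket technique, here is a self-contained sketch of the idea. It is not the package's RateLimiter: the `TokenBucket` class, its method names, and the injectable clock are assumptions made for illustration.

```typescript
// Illustrative token bucket: `capacity` tokens refill evenly over `intervalMs`.
class TokenBucket {
  private tokens: number;
  private lastRefill: number;
  private readonly capacity: number;
  private readonly intervalMs: number;
  private readonly now: () => number;

  constructor(capacity: number, intervalMs: number, now: () => number = () => Date.now()) {
    this.capacity = capacity;
    this.intervalMs = intervalMs;
    this.now = now;
    this.tokens = capacity; // start full
    this.lastRefill = now();
  }

  // Add tokens in proportion to elapsed time, capped at capacity.
  private refill(): void {
    const current = this.now();
    const earned = ((current - this.lastRefill) * this.capacity) / this.intervalMs;
    this.tokens = Math.min(this.capacity, this.tokens + earned);
    this.lastRefill = current;
  }

  // Take one token if available; false means the caller must wait.
  tryAcquire(): boolean {
    this.refill();
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }

  // Milliseconds until one full token is available (0 if one is ready).
  msUntilNextToken(): number {
    this.refill();
    if (this.tokens >= 1) return 0;
    return Math.ceil(((1 - this.tokens) * this.intervalMs) / this.capacity);
  }

  // Wait until a token is available, then take it.
  async acquire(): Promise<void> {
    while (!this.tryAcquire()) {
      const waitMs = this.msUntilNextToken();
      await new Promise<void>((resolve) => setTimeout(resolve, waitMs));
    }
  }
}
```

Passing the clock in as a function keeps the sketch testable without real delays; with 5 tokens per 10 seconds, an empty bucket earns its next token after 2 seconds.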
HttpClient
Singleton HTTP client with centralized error handling and logging.
import { HttpClient } from "@lucid-spark/ddg-web-search";
const client = HttpClient.getInstance();
// GET request
const data = await client.get("https://api.example.com");
// POST request
const result = await client.post("https://api.example.com", { key: "value" });
TypeScript Types
Complete type definitions for all interfaces and types used in the package.
SearchResult
interface SearchResult {
  title: string; // The title of the search result
  url: string; // The URL of the search result
  snippet: string; // A brief description or snippet
  icon?: string; // Optional icon URL for the result
}
WebContent
interface WebContent {
  content: string; // Main content of the page (HTML converted to Markdown)
  metadata?: { // Optional metadata extracted from the page
    title?: string; // Page title
    description?: string; // Page description
    url?: string; // Page URL
    author?: string; // Page author
    publishDate?: string; // Publication date
  };
}
FetchResult
interface FetchResult {
  success: boolean; // Whether the fetch was successful
  data?: WebContent; // Parsed web content (if available)
  error?: string; // Error message (if failed)
}
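Because `data` and `error` are both optional, callers end up using optional chaining even after checking `success`. A type guard is one way to narrow the result; the sketch below redeclares the documented interfaces so it stands alone, and `isFetchSuccess` and `describeResult` are hypothetical helpers, not package exports.

```typescript
interface WebContent {
  content: string;
  metadata?: {
    title?: string;
    description?: string;
    url?: string;
    author?: string;
    publishDate?: string;
  };
}

interface FetchResult {
  success: boolean;
  data?: WebContent;
  error?: string;
}

// A successful result that is guaranteed to carry data.
type FetchSuccess = FetchResult & { success: true; data: WebContent };

// Narrow a FetchResult so `data` can be used without optional chaining.
function isFetchSuccess(result: FetchResult): result is FetchSuccess {
  return result.success && result.data !== undefined;
}

// Produce a one-line summary of a fetch outcome.
function describeResult(result: FetchResult): string {
  if (isFetchSuccess(result)) {
    return result.data.metadata?.title ?? "(untitled)";
  }
  return `failed: ${result.error ?? "unknown error"}`;
}
```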
FetchOptions
interface FetchOptions {
  timeout?: number; // Request timeout in milliseconds
  headers?: Record<string, string>; // Custom headers
}
Error Handling
The package provides comprehensive error handling for various failure scenarios.
Search Errors
- Empty or invalid queries return empty results arrays
- Network errors are caught and logged, returning empty arrays
- Invalid API responses are handled gracefully
Fetch Errors
- Invalid URLs return error results with descriptive messages
- Network timeouts and connection errors are caught
- HTTP error status codes are handled appropriately
Example Error Handling
try {
  const results = await searcher.search("query");
  if (results.length === 0) {
    console.log("No results found");
  }
  const fetchResult = await fetcher.fetch("https://example.com");
  if (!fetchResult.success) {
    console.error("Fetch failed:", fetchResult.error);
  }
} catch (error) {
  console.error("Unexpected error:", error);
}
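Transient failures such as timeouts are often worth retrying. The sketch below retries a fetch with exponential backoff; it is an illustration layered on top of the documented FetchResult shape, not a feature of the package. `fetchWithRetry`, its defaults, and the simplified `FetchResult` declared here are assumptions.

```typescript
// Simplified copy of the documented result shape, so the sketch stands alone.
interface FetchResult {
  success: boolean;
  data?: { content: string };
  error?: string;
}

// Call `fetchOnce` up to `maxAttempts` times, doubling the delay after each
// failure. Thrown errors are converted into failed results.
async function fetchWithRetry(
  fetchOnce: (url: string) => Promise<FetchResult>,
  url: string,
  maxAttempts: number = 3,
  baseDelayMs: number = 500,
): Promise<FetchResult> {
  let last: FetchResult = { success: false, error: "no attempts made" };
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      last = await fetchOnce(url);
    } catch (error) {
      last = { success: false, error: error instanceof Error ? error.message : String(error) };
    }
    if (last.success) {
      return last;
    }
    if (attempt < maxAttempts) {
      const delayMs = baseDelayMs * 2 ** (attempt - 1);
      await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
    }
  }
  return last;
}
```

With a WebContentFetcher instance, this could be called as `fetchWithRetry((u) => fetcher.fetch(u), "https://example.com")`. Retrying is only sensible for transient errors; an invalid URL will fail the same way on every attempt.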
Rate Limiting
Built-in rate limiting prevents overwhelming external services and ensures respectful API usage.
Default Rate Limits
- WebSearcher: 1 request per 2 seconds (conservative for browser automation)
- WebContentFetcher: 1 request per second (configurable)
Custom Rate Limiting
// 3 requests per 5 seconds
const fetcher = new WebContentFetcher(3, 5000);
// Rate limiting is enforced automatically
for (let i = 0; i < 5; i++) {
  const result = await fetcher.fetch(`https://httpbin.org/get?id=${i}`);
  console.log(`Request ${i + 1} completed`);
}
Rate Limiter API
const rateLimiter = new RateLimiter(2, 1000); // 2 per second
// Check status
const status = rateLimiter.getStatus();
console.log(`${status.requests}/${status.limit} requests used`);
// Acquire permission (waits if necessary)
await rateLimiter.acquire();
console.log("Permission granted for request");
Testing
The package includes comprehensive test suites and supports easy mocking for unit tests.
Running Tests
# Run all tests
npm test
# Run tests with coverage
npm run test -- --coverage
# Run specific test file
npm test -- WebSearcher.test.ts
Mocking for Tests
import { WebSearcher } from '@lucid-spark/ddg-web-search';
import axios from 'axios';
jest.mock('axios');
const mockedAxios = axios as jest.Mocked<typeof axios>;
describe('WebSearcher', () => {
  it('should return search results', async () => {
    mockedAxios.get.mockResolvedValue({
      data: { RelatedTopics: [/* mock data */] }
    });
    const searcher = new WebSearcher();
    const results = await searcher.search('test');
    expect(results).toHaveLength(1);
  });
});
Migration Guide
Guide for migrating from older versions or similar packages.
Breaking Changes
This section will be updated as new versions are released with any breaking changes and migration instructions.
Upgrading
# Update to latest version
npm update @lucid-spark/ddg-web-search
# Check for security updates
npm audit
# Update global installation
npm install -g @lucid-spark/ddg-web-search@latest