Quick Start
Get up and running in under 2 minutes with our step-by-step guide
Schema Builder
Learn how to create custom data structures visually
Field Types
Explore 100+ available field types for realistic data
FAQ
Find answers to commonly asked questions
Quick Start Guide
Generate your first dataset in 4 simple steps
1. Choose Your Method
Upload an existing schema file (JSON, YAML, or CSV) or use the visual Schema Builder to create one from scratch. Try loading an example schema to see how it works.
2. Define Your Data Structure
Add entities (tables) and fields (columns). Choose from 100+ field types including names, emails, addresses, UUIDs, dates, and more.
3. Configure Settings
Set the number of records, choose your output format (JSON, CSV, SQL, or XML), select a locale, and optionally add realistic noise/errors.
4. Generate & Export
Click "Generate Data", then copy the synthetic dataset to the clipboard or download it.
Pro Tip
Start with an example schema (Users, E-commerce, or Employees) to understand the structure, then modify it for your needs.
Using the Schema Builder
Create custom data structures without writing code
Adding Entities
Entities represent tables or collections. Click "+ Add Entity" and give it a name like "users", "products", or "orders". Each entity will generate its own set of records.
Adding Fields
Fields are the columns in your entity. Specify the field name, choose a type from 100+ options, and optionally mark it as nullable or unique.
Field Options
Some types have extra options: Enum lets you specify custom values, Number/Integer accept min/max ranges, and Reference creates foreign key links to another entity.
Relationships
Connect entities with 1:1, 1:N, or N:N relationships. The generator ensures referential integrity across your dataset.
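To make "referential integrity" concrete, here is a minimal Python sketch of a 1:N relationship between users and orders. It is not the generator's own code (the generator runs as JavaScript in the browser); the function names `generate_users` and `generate_orders` are hypothetical and exist only for illustration. The idea is that child records draw their foreign key from the set of parent ids that were actually generated.

```python
import random
import uuid


def generate_users(count, rng):
    """Create parent records, each with its own UUID-style id."""
    return [
        {"id": str(uuid.UUID(int=rng.getrandbits(128), version=4)), "name": f"user_{i}"}
        for i in range(count)
    ]


def generate_orders(count, users, rng):
    """Create child records whose user_id always points at an existing user (1:N)."""
    user_ids = [user["id"] for user in users]
    return [{"id": i + 1, "user_id": rng.choice(user_ids)} for i in range(count)]


rng = random.Random(42)
users = generate_users(5, rng)
orders = generate_orders(20, users, rng)

# Integrity check: no order may reference a user that does not exist.
valid_ids = {user["id"] for user in users}
orphans = [order for order in orders if order["user_id"] not in valid_ids]
print(len(orphans))  # 0
```

Because every `user_id` is sampled from the generated parents, the check at the end finds no orphaned orders by construction.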
Uploading Schema Files
Import existing schemas in JSON, YAML, or CSV format
JSON:

    {
      "entities": [
        {
          "name": "users",
          "fields": [
            { "name": "id", "type": "uuid" },
            { "name": "email", "type": "email" },
            { "name": "name", "type": "fullName" },
            { "name": "age", "type": "integer", "min": 18, "max": 65 },
            { "name": "status", "type": "enum", "values": ["active", "pending", "inactive"] }
          ]
        }
      ]
    }
YAML:

    entities:
      - name: users
        fields:
          - name: id
            type: uuid
          - name: email
            type: email
          - name: name
            type: fullName
          - name: created_at
            type: datetime
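As a rough illustration of how a schema file like the ones above maps to records, the following Python sketch parses a trimmed version of the JSON example and produces a few rows. It is an assumption-laden sketch, not the generator's implementation: the helpers `make_value` and `generate` are hypothetical, only three field types are handled, and the fallback range of 0 to 100 for integers without `min`/`max` is invented for this example.

```python
import json
import random
import uuid

SCHEMA_JSON = """
{
  "entities": [
    {
      "name": "users",
      "fields": [
        { "name": "id", "type": "uuid" },
        { "name": "age", "type": "integer", "min": 18, "max": 65 },
        { "name": "status", "type": "enum", "values": ["active", "pending", "inactive"] }
      ]
    }
  ]
}
"""


def make_value(field, rng):
    """Return one value for a field definition (only three types handled here)."""
    kind = field["type"]
    if kind == "uuid":
        return str(uuid.UUID(int=rng.getrandbits(128), version=4))
    if kind == "integer":
        # Fallback bounds are an assumption of this sketch.
        return rng.randint(field.get("min", 0), field.get("max", 100))
    if kind == "enum":
        return rng.choice(field["values"])
    raise ValueError(f"unsupported field type in this sketch: {kind}")


def generate(schema, count, seed=0):
    """Build {entity name: list of records} for every entity in the schema."""
    rng = random.Random(seed)
    return {
        entity["name"]: [
            {field["name"]: make_value(field, rng) for field in entity["fields"]}
            for _ in range(count)
        ]
        for entity in schema["entities"]
    }


schema = json.loads(SCHEMA_JSON)
data = generate(schema, count=3)
print(json.dumps(data["users"][0], indent=2))
```

Each entity yields its own list of records, and each field's options (`min`, `max`, `values`) constrain the values that can appear.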
Field Types Reference
100+ field types organized by category
| Category | Types | Example Output |
|---|---|---|
| Basic | string, number, integer, boolean, date, datetime | "hello", 42.5, true, "2024-01-15" |
| Personal | firstName, lastName, fullName, email, phone, username, age, gender | "John", "Doe", "john.doe@email.com" |
| Address | address, street, city, state, country, zipCode, latitude, longitude | "123 Main St", "New York", "10001" |
| Business | company, jobTitle, department, product, price, creditCard, iban | "Acme Corp", "Engineer", "$99.99" |
| Internet | url, domain, ip, ipv6, mac, userAgent | "https://example.com", "192.168.1.1" |
| Identifiers | uuid, id, mongoId, nanoid, slug | "550e8400-e29b-41d4-a716..." |
| Text | word, words, sentence, paragraph, lorem | "Lorem ipsum dolor sit amet..." |
| Custom | enum, weightedEnum, regex, reference | Values from your custom list |
Error & Noise Settings
Add realistic imperfections for testing
Null Values Rate
Randomly replaces values with NULL to test handling of missing data. Recommended: 5-15%
Typo Rate
Introduces character swaps, deletions, and substitutions in text. Great for testing fuzzy matching.
Format Error Rate
Creates malformed emails, dates, and formatted fields. Perfect for testing input validation.
Duplicate Rate
Adds duplicate records to simulate real-world data entry errors. Test your deduplication logic.
Outlier Rate
Generates extreme numeric values for testing edge cases and anomaly detection algorithms.
Recommendation
High error rates (>20%) may produce data that's too noisy for most testing. Start with 5-10% for realistic scenarios.
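The effect of a null rate and a typo rate can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the functions `inject_nulls`, `add_typo`, and `inject_typos` are hypothetical names, and the generator's actual noise logic may differ. Each value is altered independently with the configured probability.

```python
import random


def inject_nulls(records, rate, rng):
    """Replace each value with None with probability `rate`."""
    return [
        {key: (None if rng.random() < rate else value) for key, value in record.items()}
        for record in records
    ]


def add_typo(text, rng):
    """Apply one random edit: swap two adjacent characters, delete one, or substitute one."""
    if len(text) < 2:
        return text
    i = rng.randrange(len(text) - 1)
    operation = rng.choice(["swap", "delete", "substitute"])
    if operation == "swap":
        return text[:i] + text[i + 1] + text[i] + text[i + 2:]
    if operation == "delete":
        return text[:i] + text[i + 1:]
    # A substitution may occasionally pick the same character again.
    return text[:i] + rng.choice("abcdefghijklmnopqrstuvwxyz") + text[i + 1:]


def inject_typos(records, rate, rng):
    """Add a typo to each string value with probability `rate`; other types are untouched."""
    return [
        {
            key: (add_typo(value, rng) if isinstance(value, str) and rng.random() < rate else value)
            for key, value in record.items()
        }
        for record in records
    ]


rng = random.Random(7)
clean = [{"name": "John Doe", "city": "New York", "age": 30} for _ in range(1000)]
noisy = inject_typos(inject_nulls(clean, 0.10, rng), 0.05, rng)

null_count = sum(value is None for record in noisy for value in record.values())
print(f"null share: {null_count / 3000:.1%}")
```

With a 10% null rate across 3,000 values, the observed share of nulls lands close to 10%, which is the kind of sanity check worth running on any noisy dataset before using it in tests.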
Performance & Capacity
Built for speed and scale
| Schema Complexity | Records | Time | Output Size |
|---|---|---|---|
| Simple (5 fields) | 1,000,000 | ~4 seconds | ~90 MB |
| Medium (30 fields) | 500,000 | ~15 seconds | ~500 MB |
| Complex (80+ fields) | 250,000 | ~24 seconds | ~730 MB |
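The table's figures can be turned into per-record estimates, which helps when sizing a run of a different length. The short Python calculation below assumes 1 MB = 1,000,000 bytes and treats the published times and sizes as approximate; the `derive` helper is a hypothetical name used only here.

```python
# (label, records, seconds, output size in MB), copied from the table above
rows = [
    ("Simple (5 fields)", 1_000_000, 4, 90),
    ("Medium (30 fields)", 500_000, 15, 500),
    ("Complex (80+ fields)", 250_000, 24, 730),
]


def derive(records, seconds, size_mb):
    """Return (bytes per record, records per second), assuming 1 MB = 1,000,000 bytes."""
    return size_mb * 1_000_000 / records, records / seconds


for label, records, seconds, size_mb in rows:
    bytes_per_record, records_per_second = derive(records, seconds, size_mb)
    print(f"{label}: ~{bytes_per_record:.0f} bytes/record, ~{records_per_second:,.0f} records/s")
```

By this arithmetic a simple record is roughly 90 bytes and a complex one roughly 2,920 bytes, so output size grows much faster with field count than with record count alone.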
Reproducible Data
Use the seed feature in Settings to generate identical data every time. Perfect for consistent test fixtures and sharing datasets.
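The principle behind seeding can be shown with Python's standard `random` module: a pseudo-random generator started from the same seed produces the same sequence. This is a sketch of the general technique, not the generator's internals, and `generate_ages` is a hypothetical helper.

```python
import random


def generate_ages(count, seed):
    """Generate `count` ages from a generator initialised with `seed`."""
    rng = random.Random(seed)
    return [rng.randint(18, 65) for _ in range(count)]


first = generate_ages(5, seed=1234)
second = generate_ages(5, seed=1234)
print(first == second)  # True
```

Sharing the seed together with the schema and settings is what lets two people reproduce the same fixtures.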
Frequently Asked Questions
Quick answers to common questions
No, your data is not sent to any server. All data generation happens entirely in your browser using JavaScript. Your schemas and generated data never leave your computer. The app works completely offline once loaded.
Yes, you can work from an existing database schema. Either recreate it manually in the Schema Builder, or export your database schema to JSON/YAML and upload it. The field types map closely to common database column types.
To link entities, use the Relationships tab to define connections. For example: create a "user_id" field in orders, then add a relationship from orders.user_id to users.id. The generator ensures referential integrity automatically.
Yes, other languages are supported. Change the Locale setting to generate names, addresses, and locale-specific data in German, French, Spanish, Italian, Japanese, or Chinese with appropriate formats and characters.
For a field with a fixed set of custom values, use the "Enum" field type and enter comma-separated values like "active,pending,cancelled". For weighted distribution, use "Weighted Enum" with weights: "active:70,pending:20,cancelled:10".
Repeated values can occur in large datasets for fields with a limited pool of values (like first names). Use UUID or ID types for truly unique fields, or reduce the record count.
Yes! The Synthetic Data Generator is completely free for personal and commercial projects. No usage limits, no sign-up required, and no watermarks on generated data.
If generation struggles with large datasets (100K+ records), try reducing the record count, simplifying your schema, using Chrome or Firefox, closing other tabs, or generating in smaller batches.
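The weighted enum format mentioned in the answers above ("active:70,pending:20,cancelled:10") can be illustrated with a small Python sketch. It assumes the weights are relative proportions, and `parse_weighted_enum` is a hypothetical helper rather than part of the generator.

```python
import random
from collections import Counter


def parse_weighted_enum(spec):
    """Turn "active:70,pending:20,cancelled:10" into parallel lists of values and weights."""
    values, weights = [], []
    for item in spec.split(","):
        value, weight = item.strip().rsplit(":", 1)
        values.append(value)
        weights.append(float(weight))
    return values, weights


values, weights = parse_weighted_enum("active:70,pending:20,cancelled:10")
rng = random.Random(3)
sample = rng.choices(values, weights=weights, k=10_000)
counts = Counter(sample)
print(counts.most_common())
```

Over 10,000 draws the counts come out at roughly 70%, 20%, and 10%, so the most common value is "active" and the least common is "cancelled".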