Is the input uploaded?

No. Counting happens entirely in browser JavaScript. Safe for logs, drafts, and chat transcripts.

What's the difference between characters and words?

Characters splits per code point (Hello → H, e, l, l, o). Words splits on whitespace runs (Hello world → Hello, world). Languages like Japanese, where words aren't space-separated, are best handled in Characters mode or with a dedicated tokenizer (e.g. the kuromoji-based furigana-html tool).

Is Lines mode good for log analysis?

Yes — instantly shows which exact line repeats most. Empty lines are skipped, so you can paste full log files and quickly spot error patterns or canned responses.

What exactly does case-insensitive do?

Each token is normalised with `toLowerCase()` before counting and the table displays the normalised lowercase form. Turn it off to keep the original case.

How do I see only the top N?

All rows are shown in this version. Download the CSV and filter in Excel / Sheets, or scroll. For huge unique-token counts (tens of thousands) the browser can get sluggish — try a smaller sample first.

How is the share (%) calculated?

Occurrence count ÷ total tokens × 100, rounded to two decimals. In Characters mode with whitespace skipping, the denominator is the count after whitespace is removed.

Back to Developer

Text frequency — char / word / line tallies

Tally the occurrence of every character, word, or line in your text and rank them by frequency. Toggle case-insensitivity or whitespace stripping, then export the table as CSV. Runs entirely in your browser — drafts, logs, and chat transcripts stay local.

developertextcount

How to use

Pick the unit (characters / words / lines), paste your text, and press Count frequency. Tokens are ranked by their occurrence count, with ties broken alphabetically. Toggle case-insensitive to merge A and a, or skip whitespace (Characters mode only) to drop spaces and newlines. Copy the table as TSV or download as CSV for Excel / spreadsheets. Everything runs locally — your text never leaves the browser.

FAQ

Is the input uploaded?: No. Counting happens entirely in browser JavaScript. Safe for logs, drafts, and chat transcripts.
What's the difference between characters and words?: Characters splits per code point (Hello → H, e, l, l, o). Words splits on whitespace runs (Hello world → Hello, world). Languages like Japanese, where words aren't space-separated, are best handled in Characters mode or with a dedicated tokenizer (e.g. the kuromoji-based furigana-html tool).
Is Lines mode good for log analysis?: Yes — instantly shows which exact line repeats most. Empty lines are skipped, so you can paste full log files and quickly spot error patterns or canned responses.
What exactly does case-insensitive do?: Each token is normalised with `toLowerCase()` before counting and the table displays the normalised lowercase form. Turn it off to keep the original case.
How do I see only the top N?: All rows are shown in this version. Download the CSV and filter in Excel / Sheets, or scroll. For huge unique-token counts (tens of thousands) the browser can get sluggish — try a smaller sample first.
How is the share (%) calculated?: Occurrence count ÷ total tokens × 100, rounded to two decimals. In Characters mode with whitespace skipping, the denominator is the count after whitespace is removed.

Related tools

Character counter — chars / bytes / lines / words

Count characters, words, lines, paragraphs, and UTF-8 byte size in real time. Toggle whether whitespace and newlines are included. Progress bars show your text against common limits (tweets, 400-character genkō, etc.) — everything stays in your browser.

developertextcount

Line dedupe — keep unique or extract duplicates

Split the input by newline and remove duplicate lines, keeping only the first occurrence of each line in original order. Catches non-adjacent duplicates as well (equivalent to `awk '!seen[$0]++'`). Runs entirely in your browser.

developertext

CSV / text encoding converter — Shift_JIS ↔ UTF-8 / BOM / newlines

Re-encode CSV and text files between Shift_JIS (CP932), UTF-8, UTF-16LE and EUC-JP — fix Excel's mojibake on UTF-8, hand UTF-8 text to legacy systems that need Shift_JIS, or add BOM so Excel reads UTF-8 correctly. Add / remove BOM, swap newlines (CRLF / LF / CR), and auto-detect the input encoding. Batch convert and grab the result as a ZIP. Files never leave your device — everything runs in the browser.

developertextconversion

CSV stats — per-column count / unique / mean / median / stddev

Paste or drop a CSV and instantly see per-column row count, unique values, missing values, and inferred type. Numeric columns show min / max / mean / median / stddev / sum, text columns reveal the top mode and average length. RFC 4180-compliant parser (double-quote escapes), and the delimiter (comma / semicolon / tab / pipe) is auto-detected. Header row toggle plus empty / NULL / NA recognition as missing. Your raw data never leaves the browser.

developerextractcount