HOWTO: Find interesting dictionary words with your Linux box
Few *nix users are aware of existence of one /usr/share/dict/words
on their machines. The original purpose of this file was to assist Unix programs in spell-checking. Now that every program that supports typo-prevention includes its own dictionaries, the words
file no longer fares as something significant in the geek universe.
Nevertheless, the nifty gem can still serve as a fun place to find or coin new words based on lexicographical constraints. The omnipresent egrep
command can be used to exploit the power of regular expressions against the English dictionary. Here’s how:
- Find all words containing 6 or more characters which don’t contain any vowel, dot or dash:
-bash-$ egrep -i '^[^aeiou.-]{6,}$' /usr/share/dict/words
bkbndr
BSDHyg
BSFMgt
BSGMgt
BSPhTh
crwths
crypts
Cynthy
Cynwyd
cywydd
flybys
Flysch
flysch
ftncmd
ghylls
glycyl
glycyls
glyphs
gypsyfy
gypsyry
Khlyst
Khlysts
Khlysty
Kylynn
kyschty
lymphs
lymphy
Lynndyl
MSGMgt
mtscmd
myrrhs
myrrhy
Myrvyn
Myrwyn
nymphly
nymphs
pgnttrp
Phyllys
Phylys
phytyl
psychs
pyrryl
rhythm
rhythms
Schwyz
spryly
SSTTSS
stddmp
strych
styryl
sylphs
sylphy
symphysy
synchs
synths
syzygy
thymyl
trysts
tsktsk
tsktsks
tyddyn
vyrnwy
why’ll
Wrycht
WWMCCS
xylyls - Find all words containing exactly 4 characters which can be spelled in pure Hexspeak, e.g.,
0xDEADBEEF
or0xBABEFACE
:-bash-$ egrep -i '^[abcdef]{4}$' /usr/share/dict/words
AAAA
AAEE
abac
Abad
Abba
abba
Abbe
abbe
abed
ACAA
acad
acca
acce
ACDA
aced
Adad
adad
Adda
adda
Adee
AFCC
affa
Baba
baba
Babb
Babe
babe
BAcc
Badb
bade
BAEd
baff
bead
Bebe
Bede
bede
Beeb
beef
BFDC
caba
Cabe
Caca
caca
cace
CADD
Cade
cade
CAFE
cafe
caff
CDCF
ceca
Cece
cede
CFCA
dabb
Dace
dace
Dada
dada
Dade
dade
daff
DBAC
dead
deaf
debe
decd
deda
dedd
Dede
deed
Eada
Eade
EAFB
Ebba
ebcd
ECAD
ecad
Ecca
ecce
EDAC
Edda
edda
Edea
edea
Edee
Faba
Fabe
FACD
face
fade
faff
FEAF
Febe
feeb
feed
feff - Find all words which contain ‘H’, ‘T’, ‘M’ and ‘L’ in precisely that order:
egrep -i '^h.*t.*m.*l$' /usr/share/dict/words
haemathermal
haematothermal
hemathermal
hematothermal
hepatoumbilical
hephthemimeral
heptametrical
heteroecismal
heteromeral
heterothermal
hexahydrothymol
hippotomical
histochemical
histomorphological
homeothermal
homoiothermal
homothermal
hydrothermal
hygrothermal
hyperrhythmical
hypersentimental
hyperthermal
hypertridimensional
hypostomial
hypothermal
hysteromaniacal - Find all words containing ‘s’, ‘e’ and ‘x’ but at least one different character between each of them:
-bash-$ egrep -i '^.*s[^sex]+e[^sex]+x.*$' /usr/share/dict/words
antispermotoxin
asterixis
Asteroxylaceae
Asteroxylon
Cristineaux
Erysipelothrix
erysipelothrix
Herstmonceux
Hurstmonceux
inspectrix
Issy-les-Molineux
Lisieux
mesoappendix
obstetrix
pressure-fixing
proces-verbaux
salenixon
salpingemphraxis
salteaux
saucebox
sauceboxes
sceuophylax
scleronyxis
scleroticonyxis
scleroxanthin
she-fox
side-box
sidebox
Sideroxylon
single-tax
skeptophylaxia
skeptophylaxis
slipper-foxed
smokebox
sneakbox
sore-pressedsore-taxed
sore-taxed
spectatrix
speculatrix
spermatoxin
spermotoxin
sphacelotoxin
sphenomaxillary
spice-box
splanchnemphraxis
splenauxe
splenotoxin
state-taxed
stenothorax
sternomaxillary
sternoxiphoid
stone-axe
Streptothrix
subbureaux
sulfadimethoxine
superaxillary
superfix
superfixes
superflux
supergalaxies
supergalaxy
superluxurious
superluxuriously
superluxuriousness
supermaxilla
supermaxillary
supermixture
superoxalate
superoxide
superoxygenate
superoxygenated
superoxygenating
superoxygenation
supertax
supertaxation
supertaxes
sweatbox
sweatboxes
swine-pox
swinepox
swinepoxes
Thrsieux
Now you can name your start-up company “SupErfiX” and hope that it will someday be acquired by Microsoft.
Tags: Dictionary, Easter Egg, Fun, Linux, Open Source, Tutorial