Inspirated

 
 

June 5, 2010

HOWTO: Find interesting dictionary words with your Linux box

Filed under: Blog — krkhan @ 4:24 pm

Few *nix users are aware of the existence of /usr/share/dict/words on their machines. The file's original purpose was to help Unix programs with spell-checking. Now that most programs with spell-checking bundle their own dictionaries, the words file no longer counts for much in the geek universe.
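
Not every installation ships the file by default, so it is worth checking before going further. The following is only a sketch: `count_entries` is a name made up for illustration, and the `WORDS` override is an assumption for systems that keep the list somewhere else.

```shell
#!/bin/sh
# Sketch: report whether a word list exists and how many entries it has.
# WORDS defaults to the path named in this post; override it if your
# system keeps the list elsewhere (the override is an assumption).
WORDS="${WORDS:-/usr/share/dict/words}"

# count_entries FILE: print the number of lines (one word per line) in FILE.
# "count_entries" is a hypothetical helper name.
count_entries() {
    wc -l < "$1" | tr -d ' '
}

if [ -r "$WORDS" ]; then
    echo "$WORDS: $(count_entries "$WORDS") entries"
else
    echo "$WORDS not found; it may live in a separate package on your system" >&2
fi
```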

Nevertheless, the nifty gem can still serve as a fun place to find or coin new words based on lexicographical constraints. The omnipresent egrep command can be used to exploit the power of regular expressions against the English dictionary. Here’s how:

  • Find all words containing 6 or more characters which don’t contain any vowel, dot or dash:
    -bash-$ egrep -i '^[^aeiou.-]{6,}$' /usr/share/dict/words

    bkbndr
    BSDHyg
    BSFMgt
    BSGMgt
    BSPhTh
    crwths
    crypts
    Cynthy
    Cynwyd
    cywydd
    flybys
    Flysch
    flysch
    ftncmd
    ghylls
    glycyl
    glycyls
    glyphs
    gypsyfy
    gypsyry
    Khlyst
    Khlysts
    Khlysty
    Kylynn
    kyschty
    lymphs
    lymphy
    Lynndyl
    MSGMgt
    mtscmd
    myrrhs
    myrrhy
    Myrvyn
    Myrwyn
    nymphly
    nymphs
    pgnttrp
    Phyllys
    Phylys
    phytyl
    psychs
    pyrryl
    rhythm
    rhythms
    Schwyz
    spryly
    SSTTSS
    stddmp
    strych
    styryl
    sylphs
    sylphy
    symphysy
    synchs
    synths
    syzygy
    thymyl
    trysts
    tsktsk
    tsktsks
    tyddyn
    vyrnwy
    why'll
    Wrycht
    WWMCCS
    xylyls

  • Find all words containing exactly 4 characters which can be spelled in pure Hexspeak, e.g., 0xDEADBEEF or 0xBABEFACE:
    -bash-$ egrep -i '^[abcdef]{4}$' /usr/share/dict/words

    AAAA
    AAEE
    abac
    Abad
    Abba
    abba
    Abbe
    abbe
    abed
    ACAA
    acad
    acca
    acce
    ACDA
    aced
    Adad
    adad
    Adda
    adda
    Adee
    AFCC
    affa
    Baba
    baba
    Babb
    Babe
    babe
    BAcc
    Badb
    bade
    BAEd
    baff
    bead
    Bebe
    Bede
    bede
    Beeb
    beef
    BFDC
    caba
    Cabe
    Caca
    caca
    cace
    CADD
    Cade
    cade
    CAFE
    cafe
    caff
    CDCF
    ceca
    Cece
    cede
    CFCA
    dabb
    Dace
    dace
    Dada
    dada
    Dade
    dade
    daff
    DBAC
    dead
    deaf
    debe
    decd
    deda
    dedd
    Dede
    deed
    Eada
    Eade
    EAFB
    Ebba
    ebcd
    ECAD
    ecad
    Ecca
    ecce
    EDAC
    Edda
    edda
    Edea
    edea
    Edee
    Faba
    Fabe
    FACD
    face
    fade
    faff
    FEAF
    Febe
    feeb
    feed
    feff

  • Find all words which start with ‘h’, end with ‘l’ and contain ‘t’ followed by ‘m’ somewhere in between (i.e. ‘H’, ‘T’, ‘M’ and ‘L’ in precisely that order):
    -bash-$ egrep -i '^h.*t.*m.*l$' /usr/share/dict/words

    haemathermal
    haematothermal
    hemathermal
    hematothermal
    hepatoumbilical
    hephthemimeral
    heptametrical
    heteroecismal
    heteromeral
    heterothermal
    hexahydrothymol
    hippotomical
    histochemical
    histomorphological
    homeothermal
    homoiothermal
    homothermal
    hydrothermal
    hygrothermal
    hyperrhythmical
    hypersentimental
    hyperthermal
    hypertridimensional
    hypostomial
    hypothermal
    hysteromaniacal

  • Find all words containing ‘s’, ‘e’ and ‘x’ in that order, with at least one other character (none of them ‘s’, ‘e’ or ‘x’) between each pair:
    -bash-$ egrep -i '^.*s[^sex]+e[^sex]+x.*$' /usr/share/dict/words

    antispermotoxin
    asterixis
    Asteroxylaceae
    Asteroxylon
    Cristineaux
    Erysipelothrix
    erysipelothrix
    Herstmonceux
    Hurstmonceux
    inspectrix
    Issy-les-Molineux
    Lisieux
    mesoappendix
    obstetrix
    pressure-fixing
    proces-verbaux
    salenixon
    salpingemphraxis
    salteaux
    saucebox
    sauceboxes
    sceuophylax
    scleronyxis
    scleroticonyxis
    scleroxanthin
    she-fox
    side-box
    sidebox
    Sideroxylon
    single-tax
    skeptophylaxia
    skeptophylaxis
    slipper-foxed
    smokebox
    sneakbox
    sore-pressedsore-taxed
    sore-taxed
    spectatrix
    speculatrix
    spermatoxin
    spermotoxin
    sphacelotoxin
    sphenomaxillary
    spice-box
    splanchnemphraxis
    splenauxe
    splenotoxin
    state-taxed
    stenothorax
    sternomaxillary
    sternoxiphoid
    stone-axe
    Streptothrix
    subbureaux
    sulfadimethoxine
    superaxillary
    superfix
    superfixes
    superflux
    supergalaxies
    supergalaxy
    superluxurious
    superluxuriously
    superluxuriousness
    supermaxilla
    supermaxillary
    supermixture
    superoxalate
    superoxide
    superoxygenate
    superoxygenated
    superoxygenating
    superoxygenation
    supertax
    supertaxation
    supertaxes
    sweatbox
    sweatboxes
    swine-pox
    swinepox
    swinepoxes
    Thrsieux
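
The four searches above share one shape, so they can be wrapped in a small helper. This is a sketch rather than part of the original commands: `dictgrep` is a made-up name, it calls `grep -E` (the POSIX spelling of `egrep`, which recent GNU grep releases flag as obsolescent), and it folds case and drops duplicates so that pairs such as Flysch/flysch collapse into one line. It runs against a tiny sample list, so it works even on a machine without a words file.

```shell
#!/bin/sh
# dictgrep PATTERN [FILE]: case-insensitive extended-regex search of a word
# list, with hits lower-cased and de-duplicated.
# "dictgrep" is a hypothetical helper name, not a standard command.
dictgrep() {
    grep -E -i -- "$1" "${2:-/usr/share/dict/words}" |
        tr '[:upper:]' '[:lower:]' |
        sort -u
}

# Tiny sample list so the sketch is runnable anywhere.
sample=$(mktemp)
printf '%s\n' rhythm Flysch flysch CAFE cafe beef hydrothermal superfix apple > "$sample"

dictgrep '^[^aeiou.-]{6,}$' "$sample"      # no vowels, six or more characters
dictgrep '^[abcdef]{4}$' "$sample"         # four-letter Hexspeak
dictgrep '^h.*t.*m.*l$' "$sample"          # starts with h, ends with l, t then m inside
dictgrep 's[^sex]+e[^sex]+x' "$sample"     # s, e, x kept apart by other letters

rm -f "$sample"
```

Leaving out the second argument searches /usr/share/dict/words, as in the commands above. The last pattern drops the leading `^.*` and trailing `.*$` because an unanchored pattern already matches anywhere in the line.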

Now you can name your start-up company “SupErfiX” and hope that it will someday be acquired by Microsoft.
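
The capitalisation in “SupErfiX” can be produced mechanically. As a sketch (the `brandify` name is invented for illustration), `tr` upper-cases every s, e and x in whatever words arrive on standard input:

```shell
#!/bin/sh
# brandify: upper-case the letters s, e and x in each input line.
# Hypothetical helper for illustration; it reads words on stdin.
brandify() {
    tr 'sex' 'SEX'
}

echo superfix | brandify    # prints SupErfiX
```

Piping the output of the last search above into `brandify` should give a whole list of candidates to pick from.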
