Finding Anagrams from a list of words in Python

	Publishing date: 2023-05-12 10:28 +0200

	I'm kind of obsessed with historical cryptography and
	puzzles. A week or so ago I had to find anagrams for a given
	word, and although you could use your favorite search engine
	to look up an existing list for a given language - or, even
	fancier, use ChatGPT - I decided to cook up my own.

	First, an anagram isn't just a random permutation of
	letters; it must also be a proper word that exists in the
	language. While one could simply do something like

	```
	In [20]: from random import shuffle
	In [21]: iword = list("hello")
	In [22]: shuffle(iword)
	In [23]: ''.join(iword)
	Out[23]: 'ollhe'
	```

	this isn't exactly helpful.

	So what I'm doing instead is reading a list of words into
	a list of strings, then sorting the characters of the word
	I want to find anagrams for by their ASCII values, and then
	looking for words in the list that match the same pattern.
	Example:

	```
	In [25]: [ord(c) for c in 'hello']
	Out[25]: [104, 101, 108, 108, 111]
	In [29]: o = [ord(c) for c in 'hello']
	In [30]: o.sort(); o
	Out[30]: [101, 104, 108, 108, 111]
	```
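
	The comparison itself can be sketched in a few lines. This
	is only a minimal illustration of the sorted-character
	idea; the helper names signature() and is_anagram() are
	made up for this example and are not part of the code
	further down.

```python
def signature(word):
    """Return the sorted character codes of the lowercased word."""
    return sorted(ord(c) for c in word.lower())


def is_anagram(a, b):
    """True if a and b are built from exactly the same letters.

    Hypothetical helper, for illustration only.
    """
    return signature(a) == signature(b)


print(signature("hello"))            # [101, 104, 108, 108, 111]
print(is_anagram("does", "Dose"))    # True
print(is_anagram("hello", "world"))  # False
```

	Two words are only reported as matching when their sorted
	character codes are identical, which is the same check the
	function below performs against every word in the list.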

	If an anagram exists, then there should be at least two
	words in the list that follow the same pattern. To account
	for upper-/lowercase characters, I make all characters
	lowercase first.

	So, first, read in a list of words - with one word per
	line - and put it into a list:

	```
	en = []
	with open("/home/alex/share/wordlists/english.txt") as f:
	    # Read line by line; readline() returns '' at end of file.
	    while True:
	        line = f.readline()
	        if not line:
	            break
	        else:
	            en += [ line.strip('\n') ]
	en[0:5]
	['W', 'w', 'WW', 'WWW', 'WY']
	```
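
	The docstring further down mentions a helper called
	gen_wordlist_list_from_file(), which isn't shown in this
	post. A minimal sketch of what such a helper might look
	like follows; the signature, the UTF-8 encoding and the
	skipping of blank lines are assumptions, not the original
	implementation. The temporary file only stands in for a
	real wordlist.

```python
import os
import tempfile


def gen_wordlist_list_from_file(path):
    """Read a text file with one word per line into a list of strings.

    Assumed behaviour: UTF-8 input, blank lines are skipped.
    """
    with open(path, encoding="utf-8") as f:
        # Iterating over the file yields one line at a time;
        # rstrip("\n") drops the trailing newline.
        return [line.rstrip("\n") for line in f if line.strip()]


# Tiny throwaway wordlist so the sketch runs without a real dictionary.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False,
                                 encoding="utf-8") as tmp:
    tmp.write("W\nw\nWW\n\nhello\n")

words = gen_wordlist_list_from_file(tmp.name)
os.remove(tmp.name)
print(words)  # ['W', 'w', 'WW', 'hello']
```

	Iterating over the file object does the same job as the
	explicit readline() loop above, just with less
	bookkeeping.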

	Alrighty... Now the fun:

	```
	def findAnagram(word, wl):
	    """Find an anagram for word in wordlist wl.
	    wl must be a Python list of words (as strings).
	    A wordlist can be generated by reading a flat
	    text file containing words,
	    e.g. by using the helper function
	    gen_wordlist_list_from_file().
	    """
	    # The idea is to grab all words of the same
	    # length, then sort the characters and get an
	    # ASCII representation; then find all
	    # which have the same representation.
	    word = word.lower()
	    tmp_wl = [i for i in wl if len(i) == len(word)]
	    enc_word = [ord(i) for i in word]
	    enc_word.sort()
	    out = []
	    for i in tmp_wl:
	        i = i.lower()
	        t = [ord(x) for x in i]
	        t.sort()
	        if enc_word == t:
	            out += [ i ]
	    return out
	```

	Let's try this!

	```
	[findAnagram(word, en) for word in "How does this \
	even work".split(" ")]
	[['how', 'who', 'who'],
	 ['odes', 'does', 'dose'],
	 ['this', 'hist', 'hits', 'shit'],
	 ['even'],
	 ['work']]
	```

	Fun!
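
	The doubled 'who' in the output presumably comes from the
	wordlist containing the word in more than one
	capitalisation, since every candidate is lowercased before
	it is collected. One possible variation, which is not what
	findAnagram() does, is to index the whole list once by
	sorted letters and keep the matches in sets, so duplicates
	collapse and repeated lookups get cheap. The function names
	and the sample list are made up for this sketch.

```python
from collections import defaultdict


def build_anagram_index(wl):
    """Map each sorted-letter key to the set of lowercased words sharing it."""
    index = defaultdict(set)
    for w in wl:
        w = w.lower()
        # Words built from the same letters produce the same key.
        index["".join(sorted(w))].add(w)
    return index


def find_anagrams_indexed(word, index):
    """Return all indexed words built from the same letters as word."""
    key = "".join(sorted(word.lower()))
    return sorted(index.get(key, set()))


# Small made-up sample; a real wordlist would be read from a file.
sample = ["Who", "who", "how", "does", "dose", "odes", "even"]
index = build_anagram_index(sample)
print(find_anagrams_indexed("How", index))   # ['how', 'who']
print(find_anagrams_indexed("does", index))  # ['does', 'dose', 'odes']
print(find_anagrams_indexed("work", index))  # []
```

	Unlike findAnagram(), this returns the matches in
	alphabetical order rather than wordlist order, and a word
	that is missing from the list yields an empty result.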