annotate paste/paste.11442 @ 9285:8320c9c4620f

<oerjan> learn Umlaut is German for "hum aloud", an important feature of the German language. It is indicated by putting two dots over the vowel of the syllable.
author HackBot
date Sat, 15 Oct 2016 00:04:47 +0000
parents fe852e72f4e2
children

All lines below were introduced in rev 397:fe852e72f4e2 (author HackBot, description: <elliott> pastelogs markov assumption).

1 2011-08-26.txt:20:09:35: <fizzie> Given that what you get from an n-gram is (n-1) words of context, I think it's pretty safe bet to say that the Markov assumption (of order n-1) will hold for most things you do with them.
2 2011-09-26.txt:13:03:19: <fizzie> CakeProphet: Certainly there are different ways to do language models; I just can't offhand figure out how to make a (sensible) language model that would use n-grams but not have the (n-1)-order Markov assumption.
3 2011-09-26.txt:16:54:56: <fizzie> tehporPekaC: There's an alternative solution which will always hit the target length, and thanks to the Markov assumption really shouldn't affect the distribution of the last characters of a word: when generating a word of length K with trigrams, first generate K-2 characters so that you ignore all "xy " entries. For the penultimate character, only consider such trigrams "xyz" for which any trigram "z? " exists. For the final character, only consider such trigr
4 2011-12-23.txt:09:46:31: <fizzie> "säänellaan" -- broken vowel harmony 1, Markov assumption 0.
5 2012-05-17.txt:14:19:28: <elliott> `pastlog markov assumption 0
6 2012-05-17.txt:14:20:05: <elliott> `pastlog markov assumption
fe852e72f4e2 <elliott> pastelogs markov assumption
HackBot
parents:
diff changeset
7 2012-05-17.txt:14:20:16: <HackEgo> 2011-08-26.txt:20:09:35: <fizzie> Given that what you get from an n-gram is (n-1) words of context, I think it's pretty safe bet to say that the Markov assumption (of order n-1) will hold for most things you do with them.
8 2012-05-17.txt:14:20:32: <elliott> `pastlog markov assumption
9 2012-05-17.txt:14:20:39: <HackEgo> 2011-08-26.txt:20:09:35: <fizzie> Given that what you get from an n-gram is (n-1) words of context, I think it's pretty safe bet to say that the Markov assumption (of order n-1) will hold for most things you do with them.
10 2012-05-17.txt:14:20:43: <elliott> How many things involving the Markov assumption can you say, you speech recognition researcher?
11 2012-05-17.txt:14:20:45: <elliott> `pastlog markov assumption
12 2012-05-17.txt:14:20:52: <HackEgo> 2011-09-26.txt:16:54:56: <fizzie> tehporPekaC: There's an alternative solution which will always hit the target length, and thanks to the Markov assumption really shouldn't affect the distribution of the last characters of a word: when generating a word of length K with trigrams, first generate K-2 characters so that you ignore all "xy " entries. For the penultimate character, only consider such trigrams "xyz" for
13 2012-05-17.txt:14:21:04: <elliott> `pastlog markov assumption
14 2012-05-17.txt:14:21:10: <HackEgo> 2011-09-26.txt:13:03:19: <fizzie> CakeProphet: Certainly there are different ways to do language models; I just can't offhand figure out how to make a (sensible) language model that would use n-grams but not have the (n-1)-order Markov assumption.
15 2012-05-17.txt:14:22:01: <elliott> `pastlog markov assumption
16 2012-05-17.txt:14:22:09: <HackEgo> 2011-09-26.txt:16:54:56: <fizzie> tehporPekaC: There's an alternative solution which will always hit the target length, and thanks to the Markov assumption really shouldn't affect the distribution of the last characters of a word: when generating a word of length K with trigrams, first generate K-2 characters so that you ignore all "xy " entries. For the penultimate character, only consider such trigrams "xyz" for
17 2012-05-17.txt:14:22:18: <elliott> `pastelogs markov assumption
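
A minimal Python sketch of the point fizzie makes in lines 1 and 2 above: an n-gram model conditions each word on only the previous n-1 words, which is exactly an order-(n-1) Markov assumption. This is illustrative only, not HackEgo's or fizzie's actual code; all names are invented for the example.

import random
from collections import defaultdict

def train_ngrams(tokens, n=3):
    # Count n-grams: each next word is keyed by the previous n-1 words,
    # i.e. the model bakes in an order-(n-1) Markov assumption.
    counts = defaultdict(lambda: defaultdict(int))
    for i in range(len(tokens) - n + 1):
        context = tuple(tokens[i:i + n - 1])
        counts[context][tokens[i + n - 1]] += 1
    return counts

def generate(counts, seed, length=20):
    # Sample each next word from the distribution conditioned on the last
    # n-1 generated words only; earlier history is never consulted.
    out = list(seed)
    order = len(seed)
    for _ in range(length):
        dist = counts.get(tuple(out[-order:]))
        if not dist:
            break  # unseen context: nothing to sample from
        words, weights = zip(*dist.items())
        out.append(random.choices(words, weights=weights)[0])
    return " ".join(out)

For example, train_ngrams(corpus_tokens, n=3) followed by generate(counts, ("markov", "assumption")) walks the chain using exactly two words of context at each step.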
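And a sketch of the fixed-length trick fizzie describes in line 3 (repeated, truncated, in lines 12 and 16): with character trigrams and " " as an end-of-word marker, generate the first K-2 characters while ignoring every "xy " entry, restrict the penultimate character to letters z for which some trigram "z? " exists, and restrict the final character so the word can end there. The quoted log line is cut off, so the final-character step below is a guess that follows the same pattern; the code is an illustration, not the bot's implementation.

import random
from collections import defaultdict

def train_char_trigrams(words):
    # Character trigrams with " " marking the end of a word;
    # two-space start-of-word padding is an assumption of this sketch.
    counts = defaultdict(lambda: defaultdict(int))
    for w in words:
        s = "  " + w + " "
        for i in range(len(s) - 2):
            counts[s[i:i + 2]][s[i + 2]] += 1
    return counts

def sample(dist):
    chars, weights = zip(*dist.items())
    return random.choices(chars, weights=weights)[0]

def generate_word(counts, k):
    # Assumes k >= 2 and reasonably dense training data; a sparse model can
    # leave an empty candidate set, and there is no retry logic here.
    ends_after = {ctx[0] for ctx, d in counts.items() if " " in d}  # letters z with some "z? "
    word, ctx = "", "  "
    for _ in range(k - 2):            # first k-2 characters: never pick the end marker
        word += sample({c: n for c, n in counts[ctx].items() if c != " "})
        ctx = ctx[1] + word[-1]
    # penultimate character: only z for which any trigram "z? " exists
    z = sample({c: n for c, n in counts[ctx].items() if c != " " and c in ends_after})
    word, ctx = word + z, ctx[1] + z
    # final character (guessed step): only w for which the trigram "zw " exists
    w = sample({c: n for c, n in counts[ctx].items()
                if c != " " and " " in counts.get(z + c, {})})
    return word + w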