# HG changeset patch
# User HackBot
# Date 1337264547 0
# Node ID fe852e72f4e2c46e42fe11f98af92b4ff4ef6f11
# Parent 15dec66d8533fd2d11bb38b07e8dd64f88cbf0a2
pastelogs markov assumption

diff -r 15dec66d8533 -r fe852e72f4e2 paste/paste.11442
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/paste/paste.11442	Thu May 17 14:22:27 2012 +0000
@@ -0,0 +1,17 @@