view paste/paste.11442 @ 5674:91d4a5b788c9
<tswett> echo \'[11,11,11,15,15,23,12],[5,5,5,3,53,45,16,26,00,20,15,16,22,25,45,91,32,11,15,27,06,01,11,01,47,22,30,13,43,21,11,13,29,61,65,17,19,12,28,17,11,01,23,20,16,20,81,18,32,25,58,22.,1985,10.301350435,1555466973690094680980000956080767,13720946704494913791885940266665466978579582015128512190078...\' > wisdom/code
| author | HackBot |
| --- | --- |
| date | Wed, 24 Jun 2015 14:47:46 +0000 |
| parents | fe852e72f4e2 |
| children | |
line source
2011-08-26.txt:20:09:35: <fizzie> Given that what you get from an n-gram is (n-1) words of context, I think it's pretty safe bet to say that the Markov assumption (of order n-1) will hold for most things you do with them.
2011-09-26.txt:13:03:19: <fizzie> CakeProphet: Certainly there are different ways to do language models; I just can't offhand figure out how to make a (sensible) language model that would use n-grams but not have the (n-1)-order Markov assumption.
2011-09-26.txt:16:54:56: <fizzie> tehporPekaC: There's an alternative solution which will always hit the target length, and thanks to the Markov assumption really shouldn't affect the distribution of the last characters of a word: when generating a word of length K with trigrams, first generate K-2 characters so that you ignore all "xy " entries. For the penultimate character, only consider such trigrams "xyz" for which any trigram "z? " exists. For the final character, only consider such trigr
2011-12-23.txt:09:46:31: <fizzie> "säänellaan" -- broken vowel harmony 1, Markov assumption 0.
2012-05-17.txt:14:19:28: <elliott> `pastlog markov assumption 0
2012-05-17.txt:14:20:05: <elliott> `pastlog markov assumption
2012-05-17.txt:14:20:16: <HackEgo> 2011-08-26.txt:20:09:35: <fizzie> Given that what you get from an n-gram is (n-1) words of context, I think it's pretty safe bet to say that the Markov assumption (of order n-1) will hold for most things you do with them.
2012-05-17.txt:14:20:32: <elliott> `pastlog markov assumption
2012-05-17.txt:14:20:39: <HackEgo> 2011-08-26.txt:20:09:35: <fizzie> Given that what you get from an n-gram is (n-1) words of context, I think it's pretty safe bet to say that the Markov assumption (of order n-1) will hold for most things you do with them.
2012-05-17.txt:14:20:43: <elliott> How many things involving the Markov assumption can you say, you speech recognition researcher?
2012-05-17.txt:14:20:45: <elliott> `pastlog markov assumption
2012-05-17.txt:14:20:52: <HackEgo> 2011-09-26.txt:16:54:56: <fizzie> tehporPekaC: There's an alternative solution which will always hit the target length, and thanks to the Markov assumption really shouldn't affect the distribution of the last characters of a word: when generating a word of length K with trigrams, first generate K-2 characters so that you ignore all "xy " entries. For the penultimate character, only consider such trigrams "xyz" for
2012-05-17.txt:14:21:04: <elliott> `pastlog markov assumption
2012-05-17.txt:14:21:10: <HackEgo> 2011-09-26.txt:13:03:19: <fizzie> CakeProphet: Certainly there are different ways to do language models; I just can't offhand figure out how to make a (sensible) language model that would use n-grams but not have the (n-1)-order Markov assumption.
2012-05-17.txt:14:22:01: <elliott> `pastlog markov assumption
2012-05-17.txt:14:22:09: <HackEgo> 2011-09-26.txt:16:54:56: <fizzie> tehporPekaC: There's an alternative solution which will always hit the target length, and thanks to the Markov assumption really shouldn't affect the distribution of the last characters of a word: when generating a word of length K with trigrams, first generate K-2 characters so that you ignore all "xy " entries. For the penultimate character, only consider such trigrams "xyz" for
2012-05-17.txt:14:22:18: <elliott> `pastelogs markov assumption
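The fixed-length trick fizzie describes above is concrete enough to sketch. The Python below is a minimal illustration added for readability, not code from this repo: it builds character-trigram counts, so each character is conditioned only on the two before it (the order-2 Markov assumption from the first quoted line), and it steers only the last two choices so a word of exactly k characters comes out. All names (`trigram_counts`, `generate_word`, the toy corpus) are invented for this sketch; a trailing space stands in for the "xy " word-ending entries in the log's notation.

```python
import random
from collections import Counter, defaultdict

def trigram_counts(words):
    """Count character trigrams; each word gets a trailing space, so an
    "xy " trigram records that a word may end right after "xy"."""
    counts = defaultdict(Counter)
    for word in words:
        padded = word + " "
        for i in range(len(padded) - 2):
            counts[padded[i:i + 2]][padded[i + 2]] += 1
    return counts

def pick(options):
    """Weighted random choice from a {char: count} mapping."""
    chars = list(options)
    return random.choices(chars, weights=[options[c] for c in chars])[0]

def generate_word(counts, k, seed):
    """Generate a word of exactly k characters (k >= 4, two-char seed).

    Each character is drawn from counts conditioned only on the
    previous two characters; the length is steered as fizzie says:
      * first k-2 characters: ignore all word-ending ("xy ") entries;
      * penultimate character z: only allow z for which some trigram
        "z? " exists, so an ending stays reachable;
      * final character w: only allow w for which the trigram "zw "
        itself exists, so the word can stop here.
    Dead ends (no surviving candidate) are not handled in this sketch.
    """
    ending = {ctx for ctx, nxt in counts.items() if " " in nxt}
    can_end_next = {ctx[0] for ctx in ending}
    word = seed
    while len(word) < k - 2:
        word += pick({c: n for c, n in counts[word[-2:]].items()
                      if c != " "})
    word += pick({c: n for c, n in counts[word[-2:]].items()
                  if c != " " and c in can_end_next})
    word += pick({c: n for c, n in counts[word[-2:]].items()
                  if c != " " and word[-1] + c in ending})
    return word

if __name__ == "__main__":
    counts = trigram_counts(["banana", "cabana", "havana"])
    print(generate_word(counts, 6, "ba"))  # -> "banana" on this corpus
```

Because only the last two choices are filtered, every "zw " trigram that ends a word in the training data stays reachable with its original weight, which is why the trick shouldn't skew the distribution of word-final characters, per fizzie's claim above.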