view paste/paste.11442 @ 5461:1747ab989893

<oerjan> le/rn ocean/The Pacific Ocean is half the world and surrounded by fire. The Atlantic Ocean is less cool than its giant underwater mountain range. The Arctic Ocean is cold. The Indian Ocean is full of typhoons and non-Eurocentric shipping.
author HackBot
date Sun, 07 Jun 2015 16:36:01 +0000
parents fe852e72f4e2

2011-08-26.txt:20:09:35: <fizzie> Given that what you get from an n-gram is (n-1) words of context, I think it's pretty safe bet to say that the Markov assumption (of order n-1) will hold for most things you do with them.
2011-09-26.txt:13:03:19: <fizzie> CakeProphet: Certainly there are different ways to do language models; I just can't offhand figure out how to make a (sensible) language model that would use n-grams but not have the (n-1)-order Markov assumption.
2011-09-26.txt:16:54:56: <fizzie> tehporPekaC: There's an alternative solution which will always hit the target length, and thanks to the Markov assumption really shouldn't affect the distribution of the last characters of a word: when generating a word of length K with trigrams, first generate K-2 characters so that you ignore all "xy " entries. For the penultimate character, only consider such trigrams "xyz" for which any trigram "z? " exists. For the final character, only consider such trigr
2011-12-23.txt:09:46:31: <fizzie> "säänellaan" -- broken vowel harmony 1, Markov assumption 0.
2012-05-17.txt:14:19:28: <elliott> `pastlog markov assumption 0
2012-05-17.txt:14:20:05: <elliott> `pastlog markov assumption
2012-05-17.txt:14:20:16: <HackEgo> 2011-08-26.txt:20:09:35: <fizzie> Given that what you get from an n-gram is (n-1) words of context, I think it's pretty safe bet to say that the Markov assumption (of order n-1) will hold for most things you do with them.
2012-05-17.txt:14:20:32: <elliott> `pastlog markov assumption
2012-05-17.txt:14:20:39: <HackEgo> 2011-08-26.txt:20:09:35: <fizzie> Given that what you get from an n-gram is (n-1) words of context, I think it's pretty safe bet to say that the Markov assumption (of order n-1) will hold for most things you do with them.
2012-05-17.txt:14:20:43: <elliott> How many things involving the Markov assumption can you say, you speech recognition researcher?
2012-05-17.txt:14:20:45: <elliott> `pastlog markov assumption
2012-05-17.txt:14:20:52: <HackEgo> 2011-09-26.txt:16:54:56: <fizzie> tehporPekaC: There's an alternative solution which will always hit the target length, and thanks to the Markov assumption really shouldn't affect the distribution of the last characters of a word: when generating a word of length K with trigrams, first generate K-2 characters so that you ignore all "xy " entries. For the penultimate character, only consider such trigrams "xyz" for
2012-05-17.txt:14:21:04: <elliott> `pastlog markov assumption
2012-05-17.txt:14:21:10: <HackEgo> 2011-09-26.txt:13:03:19: <fizzie> CakeProphet: Certainly there are different ways to do language models; I just can't offhand figure out how to make a (sensible) language model that would use n-grams but not have the (n-1)-order Markov assumption.
2012-05-17.txt:14:22:01: <elliott> `pastlog markov assumption
2012-05-17.txt:14:22:09: <HackEgo> 2011-09-26.txt:16:54:56: <fizzie> tehporPekaC: There's an alternative solution which will always hit the target length, and thanks to the Markov assumption really shouldn't affect the distribution of the last characters of a word: when generating a word of length K with trigrams, first generate K-2 characters so that you ignore all "xy " entries. For the penultimate character, only consider such trigrams "xyz" for
2012-05-17.txt:14:22:18: <elliott> `pastelogs markov assumption
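
The fixed-length scheme fizzie describes in the 2011-09-26 16:54:56 message can be sketched roughly as follows. This is only an illustration, not fizzie's actual code. The names `train`, `generate`, `END` and `can_precede_last` are hypothetical. The log line is cut off before the rule for the final character is finished, so the rule used here (only accept a final character z after context "xy" if the trigram "yz " exists) is an assumption. The retry loop on dead ends is also an addition that the log does not mention.

```python
import random
from collections import Counter, defaultdict

END = " "  # word-boundary marker, matching the "xy " trigrams in the log


def train(words):
    """Count character trigrams; each word is padded as '  word '."""
    model = defaultdict(Counter)
    for word in words:
        padded = END * 2 + word + END
        for i in range(len(padded) - 2):
            model[padded[i:i + 2]][padded[i + 2]] += 1
    return model


def generate(model, k, rng=random, max_tries=100):
    """Try to sample a word of exactly k characters; None if every try dead-ends."""
    # Characters z for which some trigram "z? " exists (the penultimate rule).
    can_precede_last = {
        ctx[0]
        for ctx, nxt in model.items()
        if nxt[END] > 0 and ctx[0] != END and ctx[1] != END
    }
    for _ in range(max_tries):
        ctx, out = END * 2, []
        for i in range(k):
            remaining = k - i
            counts = model.get(ctx, {})
            # Never end the word early: ignore all "xy " entries.
            cands = {c: n for c, n in counts.items() if c != END}
            if remaining == 2:
                # Penultimate character: must be able to start a "z? " trigram.
                cands = {c: n for c, n in cands.items() if c in can_precede_last}
            elif remaining == 1:
                # Final character (assumed rule): "yz " must exist.
                cands = {
                    c: n
                    for c, n in cands.items()
                    if model.get(ctx[1] + c, {}).get(END, 0) > 0
                }
            if not cands:
                break  # dead end, start this try over
            ch = rng.choices(list(cands), weights=list(cands.values()))[0]
            out.append(ch)
            ctx = ctx[1] + ch
        else:
            return "".join(out)
    return None


if __name__ == "__main__":
    demo = train(["cat", "car", "cart", "dog", "dot", "bat", "bar"])
    print(generate(demo, 4))
```

Because each choice depends only on the previous two characters, the length constraints only touch the last two sampling steps; everything before them is drawn from the unmodified trigram counts.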