Yep! Doors have been OPENED 🤯 An open-source cousin of GPT-3 is here 😇
- Performs on par with 6.7B GPT-3
- Performs better and decodes faster than GPT-Neo
- repo + colab + free web demo
Got to know about it through a Towards Data Science article: https://towardsdatascience.com/c...
More details in @arankomatsuzaki's article: https://arankomatsuzaki.wordpres...
@blakehunsicker yes, it can be fine-tuned at a rate of ~5,000 tokens/second, which should be sufficient for small-to-medium-sized datasets. Fine-tuning instructions are here: https://github.com/kingoflolz/me...
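The linked repo covers the official TPU-based mesh-transformer-jax setup. Purely as a rough sketch of what one fine-tuning step looks like conceptually, here's a minimal causal-LM training step using the Hugging Face port of GPT-J — the model id ("EleutherAI/gpt-j-6B"), optimizer, and learning rate are illustrative assumptions, and a 6B model realistically needs a large accelerator or the TPU recipe above:

```python
# Minimal sketch only — not the official mesh-transformer-jax recipe linked above.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "EleutherAI/gpt-j-6B"  # assumed Hugging Face port of the released weights
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
model.train()

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)  # illustrative hyperparameters

# One toy step: passing labels=input_ids gives the standard next-token
# (causal LM) objective; the model shifts the labels internally.
batch = tokenizer("Example text from your fine-tuning dataset.", return_tensors="pt")
outputs = model(**batch, labels=batch["input_ids"])
outputs.loss.backward()
optimizer.step()
optimizer.zero_grad()
```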
Hey, hope this is still relevant.
I find GPT-J quite alright at generation, but it gives silly results when it does summaries. Are there any experts here who could help with how I might train it to produce TL;DRs?
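Before any fine-tuning, a common zero-shot trick with GPT-style models is "TL;DR:" prompting — append "TL;DR:" to the passage and let the model continue. A minimal sketch, assuming the Hugging Face port of GPT-J; the checkpoint name and generation settings are assumptions, not a tested recipe:

```python
# Zero-shot "TL;DR:" prompting sketch with the Hugging Face port of GPT-J.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "EleutherAI/gpt-j-6B"  # assumed checkpoint name
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

article = "..."  # the text you want summarized
prompt = article + "\n\nTL;DR:"

inputs = tokenizer(prompt, return_tensors="pt")
output_ids = model.generate(
    **inputs,
    max_new_tokens=60,        # keep the summary short
    do_sample=True,           # sampling settings are illustrative
    temperature=0.7,
    top_p=0.9,
    pad_token_id=tokenizer.eos_token_id,
)

# Decode only the newly generated tokens (generate returns prompt + continuation).
summary = tokenizer.decode(
    output_ids[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
)
print(summary)
```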
@pallpakk some results were definitely weird but overall, it works great! Negative sentiment, foul language, etc. are context-specific outputs, so if an input is itself negative/abusive, the output is bound to reinforce the same sentiment.
*GPT-J is just as good as GPT-3.* It is more efficient, but with more quirks. In our JPRED scores, it did better with simple TCS tasks, but lost with the more complex tasks.
By removing the Jordan Algorithm: Our next proposed change to a probability model is removing the Jordan Algorithm. The Jordan Algorithm is a special procedure used for simple TCS tasks that allows for fast analysis of different sequence pairs, as well as being able to easily analyze simple n-gram (aka word) models.
...