My research area has been time series forecasting and unsupervised anomaly detection, but it is SOMEWHAT related to NLP.
Papers with Code had a few potential implementations: https://paperswithcode.com/paper/hyena-hierarchy-towards-larger-convolutional
I am always skeptical of papers. The results could be good, but how much did the authors adjust their experiments to look good on paper?