Jan. 28, 2024, 4:31 p.m. | /u/bkffadia

Deep Learning www.reddit.com

I work with DNA sequences as input to my deep learning model, and I save them as one-hot encoded NumPy arrays in an h5 file. My dataset has 700k examples and is 500 GB in size. I want to make training faster, so I have a bunch of questions:

- Is it better to store them as 1D arrays (integer-coded rather than one-hot encoded) in the h5 file and then transform them to one-hot encoded arrays during loading? Would this make things …
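
To make the first question concrete, here is a minimal sketch of the idea in plain NumPy: keep one small integer per base for storage, and expand to one-hot only when a batch is loaded. The base order in `BASE_TO_INDEX` and the helper names `encode_indices` and `to_one_hot` are assumptions for illustration and do not come from the post. The h5 read itself is left out, and whether this actually speeds up training depends on where the bottleneck is.

```python
import numpy as np

# Hypothetical mapping: the post does not say which base order is used.
BASE_TO_INDEX = {"A": 0, "C": 1, "G": 2, "T": 3}

# Row i of the identity matrix is the one-hot vector for class i.
ONE_HOT_TABLE = np.eye(4, dtype=np.float32)


def encode_indices(seq: str) -> np.ndarray:
    """Compact form for storage: one uint8 per base."""
    return np.array([BASE_TO_INDEX[base] for base in seq], dtype=np.uint8)


def to_one_hot(indices: np.ndarray) -> np.ndarray:
    """Expand integer codes of shape (..., L) to one-hot (..., L, 4) at load time."""
    return ONE_HOT_TABLE[indices]


stored = encode_indices("ACGTAC")   # what would be written to disk
batch = to_one_hot(stored)          # what the model would receive
```

Assuming the one-hot array uses at least one byte per entry, the integer-coded form is at least 4x smaller before compression, so less data has to be read per batch. The table lookup in `to_one_hot` is a single vectorized indexing operation, so it can run inside the data loader on a whole batch at once.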

