June 13, 2022, 7:06 a.m. | /u/chillingfox123

Data Science www.reddit.com

Do you put everything in functions or even classes?

I’m trying to go from a self-taught noob to someone who’s not completely clueless, and I’m wondering what the best practice is.

I’ve found that, especially for larger projects, putting stuff in functions has been a game-changer, because otherwise I’d have loads of random variables polluting the global namespace, e.g. `df_clean`, `df_clean_na_drop`, `fig_value_count_XYZ`.
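A rough sketch of what I mean, using plain lists of dicts instead of a real DataFrame so it runs with only the standard library. The names `drop_missing`, `value_counts`, `run_analysis` and the toy data are all made up for illustration; with pandas the same shape would apply, just with DataFrame operations inside the functions.

```python
from collections import Counter


def drop_missing(rows):
    """Return only the rows that have no missing (None) values."""
    return [row for row in rows if all(value is not None for value in row.values())]


def value_counts(rows, column):
    """Count how often each value appears in one column."""
    return Counter(row[column] for row in rows)


def run_analysis(raw_rows, column):
    """Run the whole pipeline; intermediates exist only inside this call."""
    clean_rows = drop_missing(raw_rows)         # local, not a global variable
    counts = value_counts(clean_rows, column)   # local as well
    return counts


# Hypothetical toy data standing in for a real dataset.
raw_rows = [
    {"city": "Oslo", "score": 3},
    {"city": "Oslo", "score": None},
    {"city": "Lima", "score": 5},
    {"city": "Oslo", "score": 4},
]

# Only the final result ends up in the global namespace.
city_counts = run_analysis(raw_rows, "city")
```

The intermediate `clean_rows` never becomes a global, so there is nothing left lying around to reuse by accident further down the script or notebook.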

What’s the benefit of using classes instead of functions for this purpose? Any other pro tips?

