Time Complexity of Detach() in torch [R]
Oct. 5, 2022, 4:37 p.m. | /u/mishtimoi
Machine Learning | www.reddit.com
So I have an empirical observation: when I train a large model end-to-end vs. the same model in a staggered fashion, i.e. some layers are frozen while others receive gradient updates, the latter takes more training time even though the number of trainable parameters is smaller. This leads me to suspect that the detach() operation is the culprit. I cannot find many resources online to help me understand the time complexity of the detach() operation in torch. Did anyone …
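For reference, here is a minimal timing sketch of one "staggered" configuration, assuming freezing is done via requires_grad_(False); the model, sizes, and step count are illustrative assumptions, not from the post. Note that tensor.detach() itself is O(1) in the tensor's size: it returns a new tensor that shares the same storage and copies no data, so its per-call cost alone is unlikely to explain a large slowdown.

```python
import time
import torch
import torch.nn as nn

# Illustrative model; layer sizes are arbitrary assumptions.
model = nn.Sequential(
    nn.Linear(512, 512), nn.ReLU(),
    nn.Linear(512, 512), nn.ReLU(),
    nn.Linear(512, 10),
)

# Freeze the first two Linear layers: autograd then skips gradient
# computation for their weights, though their forward pass still runs
# on every step.
for layer in (model[0], model[2]):
    for p in layer.parameters():
        p.requires_grad_(False)

x = torch.randn(256, 512)
target = torch.randint(0, 10, (256,))
opt = torch.optim.SGD([p for p in model.parameters() if p.requires_grad], lr=0.1)
loss_fn = nn.CrossEntropyLoss()

steps = 50
start = time.perf_counter()
for _ in range(steps):
    opt.zero_grad()
    loss = loss_fn(model(x), target)
    loss.backward()
    opt.step()
print(f"avg step time (frozen layers): {(time.perf_counter() - start) / steps:.4f}s")
```

Rerunning the same loop without the freezing block gives the fully trainable baseline for comparison, which is one way to check whether the staggered setup is really slower per step.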