Aug. 10, 2022, 5:58 p.m. | /u/Silly_Ad_4008

Natural Language Processing www.reddit.com

Since an embedding in PyTorch acts as a lookup table, is there any difference between these two code snippets?

(model.shared refers to the embedding layer of a T5 transformer)

model.shared = new_emb
model.lm_head = new_head

and

model.shared.weight = new_emb.weight
model.lm_head.weight = new_head.weight

The reason I am asking is that the two versions give me different loss values (cross-validation loss):

Loss for code piece 1: [https://i.stack.imgur.com/D0sz7.png](https://i.stack.imgur.com/D0sz7.png)

Loss for code piece 2: [https://i.stack.imgur.com/FvhMW.png](https://i.stack.imgur.com/FvhMW.png)
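One place the two snippets can genuinely diverge is weight tying: T5 ties lm_head.weight to shared.weight by default, and module assignment versus Parameter assignment interact differently with what survives the swap. A minimal plain-PyTorch sketch of the difference, where TinyModel is a hypothetical stand-in for the tied shared/lm_head pair (not the real T5 class):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-in for T5's tied embedding / LM head:
# lm_head.weight and shared.weight are the same Parameter.
class TinyModel(nn.Module):
    def __init__(self, vocab=10, dim=4):
        super().__init__()
        self.shared = nn.Embedding(vocab, dim)
        self.lm_head = nn.Linear(dim, vocab, bias=False)
        self.lm_head.weight = self.shared.weight  # weight tying

new_emb = nn.Embedding(10, 4)
new_head = nn.Linear(4, 10, bias=False)

# Code piece 1: replace the whole modules.
m1 = TinyModel()
m1.shared = new_emb
m1.lm_head = new_head
print(m1.shared is new_emb)                   # True: module object swapped out
print(m1.shared.weight is m1.lm_head.weight)  # False: the tie is broken

# Code piece 2: replace only the weight Parameters.
m2 = TinyModel()
m2.shared.weight = new_emb.weight
m2.lm_head.weight = new_head.weight
print(m2.shared is new_emb)                   # False: original module object kept
print(m2.shared.weight is m2.lm_head.weight)  # False: the tie is broken here too
```

Either way the tie between the two weights is gone after the swap, but only code piece 2 keeps the original module objects (and any hooks or references to them). Also, an optimizer built before the swap still holds the old Parameters in both cases. With transformers models, the supported route is model.resize_token_embeddings(n) or model.set_input_embeddings(new_emb), which go through tie_weights() and re-tie lm_head to the new embedding; raw attribute assignment skips that step, which is a plausible cause of the different losses.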

Tags: embedding, languagetechnology, pytorch, transformers
