An approach where two separate networks are trained simultaneously on the same task, and the gradients from each are used to update the other, potentially leading to better generalization. 27.07.2023 17:54 aior