CoSMO Semantic Post
Leshem Choshen 🤖🤗 @ICML wanna talk?
2024-09-13T18:09:57.099Z
Keywords: AI, cost, initialresults, models, modelscale, scalinglaws, training

Scaling laws don't care about the scale of the "train" models?
Did anyone else get this?
When I predict a scaling law, the scale of the largest model matters, but the number of models used for fitting matters much, much more.
Initial results: scaling error by #models, starting from the largest https://twitter.com/LChoshen/status/1803401845626511568/photo/1

Maybe more simply put:
You can predict a scaling law with 8 small models, and it would be better than with 3 large ones (which cost a lot).

Is that something anyone else has seen?
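
A minimal sketch of the kind of comparison described in the post, on synthetic data only. It assumes a pure power law, loss = a * size^(-b), with multiplicative noise and no irreducible-loss term; it fits the law in log-log space and measures the extrapolation error at a larger target size. All names (`fit_power_law`, `relative_error`, `synthetic_loss`), constants, and model sizes are hypothetical and not taken from the post, and the sketch does not reproduce the linked results.

```python
import math
import random


def fit_power_law(sizes, losses):
    """Least-squares line through (log size, log loss).

    Returns (a, b) for the assumed form loss = a * size**(-b).
    """
    xs = [math.log(s) for s in sizes]
    ys = [math.log(v) for v in losses]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope


def relative_error(train_sizes, train_losses, target_size, target_loss):
    """Relative error of the fitted law when extrapolated to target_size."""
    a, b = fit_power_law(train_sizes, train_losses)
    predicted = a * target_size ** (-b)
    return abs(predicted - target_loss) / target_loss


def synthetic_loss(size, rng, a=10.0, b=0.1, noise=0.02):
    """Hypothetical ground truth: power law times log-normal noise."""
    return a * size ** (-b) * math.exp(rng.gauss(0.0, noise))


if __name__ == "__main__":
    rng = random.Random(0)
    target_size = 1e10
    target_loss = 10.0 * target_size ** (-0.1)  # noiseless value at the target

    # Assumed setups: 8 small models vs. 3 large ones, log-spaced sizes.
    small_sizes = [10 ** (7 + 0.2 * i) for i in range(8)]  # 1e7 .. ~2.5e8
    large_sizes = [10 ** (9 + 0.3 * i) for i in range(3)]  # 1e9 .. ~4e9

    for name, sizes in [("8 small", small_sizes), ("3 large", large_sizes)]:
        losses = [synthetic_loss(s, rng) for s in sizes]
        print(name, relative_error(sizes, losses, target_size, target_loss))
```

Which setup wins in this toy depends on the assumed noise level, the spacing of the sizes, and the random seed, so averaging over many seeds would be needed before reading anything into a single run.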