https://w3id.org/np/RAmEQd6Xc3YHZ6uH6_kOfPKx1SDVV-wn6aK8LrvcmNyZE#head https://w3id.org/np/RAmEQd6Xc3YHZ6uH6_kOfPKx1SDVV-wn6aK8LrvcmNyZE http://www.nanopub.org/nschema#hasAssertion https://w3id.org/np/RAmEQd6Xc3YHZ6uH6_kOfPKx1SDVV-wn6aK8LrvcmNyZE#assertion https://w3id.org/np/RAmEQd6Xc3YHZ6uH6_kOfPKx1SDVV-wn6aK8LrvcmNyZE http://www.nanopub.org/nschema#hasProvenance https://w3id.org/np/RAmEQd6Xc3YHZ6uH6_kOfPKx1SDVV-wn6aK8LrvcmNyZE#provenance https://w3id.org/np/RAmEQd6Xc3YHZ6uH6_kOfPKx1SDVV-wn6aK8LrvcmNyZE http://www.nanopub.org/nschema#hasPublicationInfo https://w3id.org/np/RAmEQd6Xc3YHZ6uH6_kOfPKx1SDVV-wn6aK8LrvcmNyZE#pubinfo https://w3id.org/np/RAmEQd6Xc3YHZ6uH6_kOfPKx1SDVV-wn6aK8LrvcmNyZE http://www.w3.org/1999/02/22-rdf-syntax-ns#type http://www.nanopub.org/nschema#Nanopublication https://w3id.org/np/RAmEQd6Xc3YHZ6uH6_kOfPKx1SDVV-wn6aK8LrvcmNyZE#assertion https://w3id.org/np/RAmEQd6Xc3YHZ6uH6_kOfPKx1SDVV-wn6aK8LrvcmNyZE#assertion http://purl.org/dc/terms/creator https://w3id.org/np/RAoSadUw99CeqDlR2400018nqTzR_38fT86OrTzk16Vts https://w3id.org/np/RAmEQd6Xc3YHZ6uH6_kOfPKx1SDVV-wn6aK8LrvcmNyZE#assertion http://www.w3.org/1999/02/22-rdf-syntax-ns#type https://schema.org/Claim https://w3id.org/np/RAmEQd6Xc3YHZ6uH6_kOfPKx1SDVV-wn6aK8LrvcmNyZE#assertion http://www.w3.org/1999/02/22-rdf-syntax-ns#type https://schema.org/Observation https://w3id.org/np/RAmEQd6Xc3YHZ6uH6_kOfPKx1SDVV-wn6aK8LrvcmNyZE#assertion http://www.w3.org/1999/02/22-rdf-syntax-ns#type https://schema.org/Question https://w3id.org/np/RAmEQd6Xc3YHZ6uH6_kOfPKx1SDVV-wn6aK8LrvcmNyZE#assertion http://www.w3.org/2000/01/rdf-schema#comment Scaling laws don't care about scale of the "train" models? Did anyone else get this? When I predict a scaling law, the scale of the largest model matters, but the num-models for fitting matters much much much more. Initial results, scaling error by #models starting from largest https://twitter.com/LChoshen/status/1803401845626511568/photo/1 Maybe more simply put: You can predict a scaling law with 8 small models, and it would be better than 3 large ones (that costs a lot) Is that something anyone else seen? https://w3id.org/np/RAmEQd6Xc3YHZ6uH6_kOfPKx1SDVV-wn6aK8LrvcmNyZE#assertion https://schema.org/keywords AI https://w3id.org/np/RAmEQd6Xc3YHZ6uH6_kOfPKx1SDVV-wn6aK8LrvcmNyZE#assertion https://schema.org/keywords cost https://w3id.org/np/RAmEQd6Xc3YHZ6uH6_kOfPKx1SDVV-wn6aK8LrvcmNyZE#assertion https://schema.org/keywords initialresults https://w3id.org/np/RAmEQd6Xc3YHZ6uH6_kOfPKx1SDVV-wn6aK8LrvcmNyZE#assertion https://schema.org/keywords models https://w3id.org/np/RAmEQd6Xc3YHZ6uH6_kOfPKx1SDVV-wn6aK8LrvcmNyZE#assertion https://schema.org/keywords modelscale https://w3id.org/np/RAmEQd6Xc3YHZ6uH6_kOfPKx1SDVV-wn6aK8LrvcmNyZE#assertion https://schema.org/keywords scalinglaws https://w3id.org/np/RAmEQd6Xc3YHZ6uH6_kOfPKx1SDVV-wn6aK8LrvcmNyZE#assertion https://schema.org/keywords training https://w3id.org/np/RAmEQd6Xc3YHZ6uH6_kOfPKx1SDVV-wn6aK8LrvcmNyZE#provenance https://sense-nets.xyz/ http://www.w3.org/1999/02/22-rdf-syntax-ns#type http://www.w3.org/ns/prov#SoftwareAgent https://sense-nets.xyz/ http://www.w3.org/ns/prov#actedOnBehalfOf https://w3id.org/np/RAoSadUw99CeqDlR2400018nqTzR_38fT86OrTzk16Vts https://w3id.org/np/RAmEQd6Xc3YHZ6uH6_kOfPKx1SDVV-wn6aK8LrvcmNyZE#activity http://www.w3.org/1999/02/22-rdf-syntax-ns#type https://sense-nets.xyz/supervisedActivity https://w3id.org/np/RAmEQd6Xc3YHZ6uH6_kOfPKx1SDVV-wn6aK8LrvcmNyZE#activity http://www.w3.org/ns/prov#wasAssociatedWith https://sense-nets.xyz/ https://w3id.org/np/RAmEQd6Xc3YHZ6uH6_kOfPKx1SDVV-wn6aK8LrvcmNyZE#assertion http://www.w3.org/ns/prov#linksTo https://x.com/LChoshen/status/1803401845626511568 https://w3id.org/np/RAmEQd6Xc3YHZ6uH6_kOfPKx1SDVV-wn6aK8LrvcmNyZE#assertion http://www.w3.org/ns/prov#wasAssociatedWith https://x.com/LChoshen https://w3id.org/np/RAmEQd6Xc3YHZ6uH6_kOfPKx1SDVV-wn6aK8LrvcmNyZE#assertion http://www.w3.org/ns/prov#wasAttributedTo https://orcid.org/0000-0002-0085-6496 https://w3id.org/np/RAmEQd6Xc3YHZ6uH6_kOfPKx1SDVV-wn6aK8LrvcmNyZE#assertion http://www.w3.org/ns/prov#wasAttributedTo https://w3id.org/np/RAoSadUw99CeqDlR2400018nqTzR_38fT86OrTzk16Vts https://w3id.org/np/RAmEQd6Xc3YHZ6uH6_kOfPKx1SDVV-wn6aK8LrvcmNyZE#assertion http://www.w3.org/ns/prov#wasGeneratedBy https://w3id.org/np/RAmEQd6Xc3YHZ6uH6_kOfPKx1SDVV-wn6aK8LrvcmNyZE#activity https://w3id.org/np/RAoSadUw99CeqDlR2400018nqTzR_38fT86OrTzk16Vts http://xmlns.com/foaf/0.1/account https://orcid.org/0000-0002-0085-6496 https://w3id.org/np/RAoSadUw99CeqDlR2400018nqTzR_38fT86OrTzk16Vts http://xmlns.com/foaf/0.1/account https://x.com/LChoshen https://w3id.org/np/RAmEQd6Xc3YHZ6uH6_kOfPKx1SDVV-wn6aK8LrvcmNyZE#pubinfo https://w3id.org/np/RAmEQd6Xc3YHZ6uH6_kOfPKx1SDVV-wn6aK8LrvcmNyZE#sig http://purl.org/nanopub/x/hasAlgorithm RSA https://w3id.org/np/RAmEQd6Xc3YHZ6uH6_kOfPKx1SDVV-wn6aK8LrvcmNyZE#sig http://purl.org/nanopub/x/hasPublicKey MIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEArHtI92jm8pAYVsvJabxLGfOT+7G0JyJGh2gwjB5x2pFPga6wWTd+rNBWWUZViIFnaJrBEsJpgdnoupLU9ppwn+khMiGRfxqGsDDzwHcj3Jc75CRys7d3etwXdBdoXfBgjsJiZBazwm13idr6tljRrC1TaEJBnRQAqzBw9cLDeGY77cSznzXT39feUGT168dpCSE9O6u/48DvvWVqciHGsH9cQ+LroJJVsMrorwtsdZnAK+q48wtIP6pIpw5shSJ5LnA0qeN/f4TvTFDV6ItYIXjiWWpTECc/Bxmfnyat3B5xWCu9nvz8fEs7Ns0TuzQwT3/K55iSKDEIi/E0nO97xwIDAQAB https://w3id.org/np/RAmEQd6Xc3YHZ6uH6_kOfPKx1SDVV-wn6aK8LrvcmNyZE#sig http://purl.org/nanopub/x/hasSignature kwO4GqIhQeFzJGBGWRP9n9T+melmnkrd/EaUCcLuZScYqLWjoRAdThFjYLDjPNDEUtX77Ddf46qBmfw4Ydm9ksvPfRIyKj78nGliWcWESn8zdbCyr6h/ldezO7psXGlWqi4FeyLKsvfBC3fjPZh24pteD1VKWOhL4X4gUYfE+W7aKklx5pM3WmXq0DQefbaQXHpyq3PeMFiUPbmC4O92iRO1k0izQ2KWkNSJXr1Q7q8nwcoran09uRPYam8NUwt+zU8t/NRS+bGRC702gLi32ZZFKN9Q9XcmwFHHazArKdSDqZQleq/aJhXvmtmlIkZWF+1geA3bknx6thSnwP99og== https://w3id.org/np/RAmEQd6Xc3YHZ6uH6_kOfPKx1SDVV-wn6aK8LrvcmNyZE#sig http://purl.org/nanopub/x/hasSignatureTarget https://w3id.org/np/RAmEQd6Xc3YHZ6uH6_kOfPKx1SDVV-wn6aK8LrvcmNyZE https://w3id.org/np/RAmEQd6Xc3YHZ6uH6_kOfPKx1SDVV-wn6aK8LrvcmNyZE#sig http://purl.org/nanopub/x/singedBy https://sense-nets.xyz/ https://w3id.org/np/RAmEQd6Xc3YHZ6uH6_kOfPKx1SDVV-wn6aK8LrvcmNyZE#sig http://www.w3.org/ns/prov#wasAssociatedWith https://w3id.org/np/RAoSadUw99CeqDlR2400018nqTzR_38fT86OrTzk16VtssigningDelegation https://w3id.org/np/RAmEQd6Xc3YHZ6uH6_kOfPKx1SDVV-wn6aK8LrvcmNyZE http://purl.org/dc/terms/created 2024-09-13T18:09:57.099Z https://w3id.org/np/RAmEQd6Xc3YHZ6uH6_kOfPKx1SDVV-wn6aK8LrvcmNyZE http://purl.org/dc/terms/creator https://w3id.org/np/RAoSadUw99CeqDlR2400018nqTzR_38fT86OrTzk16Vts https://w3id.org/np/RAmEQd6Xc3YHZ6uH6_kOfPKx1SDVV-wn6aK8LrvcmNyZE http://purl.org/dc/terms/license https://creativecommons.org/licenses/by/4.0/ https://w3id.org/np/RAmEQd6Xc3YHZ6uH6_kOfPKx1SDVV-wn6aK8LrvcmNyZE http://purl.org/nanopub/x/hasNanopubType https://sense-nets.xyz/SemanticPost https://w3id.org/np/RAmEQd6Xc3YHZ6uH6_kOfPKx1SDVV-wn6aK8LrvcmNyZE http://purl.org/nanopub/x/wasCreatedAt https://sense-nets.xyz/ https://w3id.org/np/RAmEQd6Xc3YHZ6uH6_kOfPKx1SDVV-wn6aK8LrvcmNyZE http://www.w3.org/2000/01/rdf-schema#label CoSMO Semantic Post https://w3id.org/np/RAmEQd6Xc3YHZ6uH6_kOfPKx1SDVV-wn6aK8LrvcmNyZE http://www.w3.org/ns/prov#wasAttributedTo https://orcid.org/0000-0002-0085-6496 https://w3id.org/np/RAmEQd6Xc3YHZ6uH6_kOfPKx1SDVV-wn6aK8LrvcmNyZE https://sense-nets.xyz/hasRootSigner 0xf6ECcfD463afB464dcC85b051DF2E93E2646E6D2 https://w3id.org/np/RAoSadUw99CeqDlR2400018nqTzR_38fT86OrTzk16Vts http://xmlns.com/foaf/0.1/account https://orcid.org/0000-0002-0085-6496 https://w3id.org/np/RAoSadUw99CeqDlR2400018nqTzR_38fT86OrTzk16Vts http://xmlns.com/foaf/0.1/name Leshem Choshen 🤖🤗 @ICML wanna talk?