https://w3id.org/np/RA5onfai3TcQTXxloau--mcY6JKg8yXNeMmqo29rFn4Qc#head https://w3id.org/np/RA5onfai3TcQTXxloau--mcY6JKg8yXNeMmqo29rFn4Qc http://www.nanopub.org/nschema#hasAssertion https://w3id.org/np/RA5onfai3TcQTXxloau--mcY6JKg8yXNeMmqo29rFn4Qc#assertion https://w3id.org/np/RA5onfai3TcQTXxloau--mcY6JKg8yXNeMmqo29rFn4Qc http://www.nanopub.org/nschema#hasProvenance https://w3id.org/np/RA5onfai3TcQTXxloau--mcY6JKg8yXNeMmqo29rFn4Qc#provenance https://w3id.org/np/RA5onfai3TcQTXxloau--mcY6JKg8yXNeMmqo29rFn4Qc http://www.nanopub.org/nschema#hasPublicationInfo https://w3id.org/np/RA5onfai3TcQTXxloau--mcY6JKg8yXNeMmqo29rFn4Qc#pubinfo https://w3id.org/np/RA5onfai3TcQTXxloau--mcY6JKg8yXNeMmqo29rFn4Qc http://www.w3.org/1999/02/22-rdf-syntax-ns#type http://www.nanopub.org/nschema#Nanopublication https://w3id.org/np/RA5onfai3TcQTXxloau--mcY6JKg8yXNeMmqo29rFn4Qc#assertion https://arxiv.org/abs/2406.17681 https://sense-nets.xyz/hasZoteroItemType preprint https://w3id.org/np/RA5onfai3TcQTXxloau--mcY6JKg8yXNeMmqo29rFn4Qc#assertion http://purl.org/dc/terms/creator https://w3id.org/np/RAoSadUw99CeqDlR2400018nqTzR_38fT86OrTzk16Vts https://w3id.org/np/RA5onfai3TcQTXxloau--mcY6JKg8yXNeMmqo29rFn4Qc#assertion http://purl.org/spar/cito/includesQuotationFrom https://x.com/neuranna/status/1791465842632454184 https://w3id.org/np/RA5onfai3TcQTXxloau--mcY6JKg8yXNeMmqo29rFn4Qc#assertion http://purl.org/spar/cito/linksTo https://arxiv.org/abs/2406.17681 https://w3id.org/np/RA5onfai3TcQTXxloau--mcY6JKg8yXNeMmqo29rFn4Qc#assertion http://www.w3.org/2000/01/rdf-schema#comment To reduce evaluation contamination @XuanmingZhang07 @Zhou_Yu_AI @columbianlp et al. convert dataset examples into templates(Fig.) https://arxiv.org/abs/2406.17681 EWOK datasets are built to have this trait https://x.com/neuranna/status/1791465842632454184 Interesting trend will it last? solve contamination? https://twitter.com/LChoshen/status/1806396147281637645/photo/1 @XuanmingZhang07 @Zhou_Yu_AI @columbianlp If you ask me, a nice step, but it only solves the worst contamination (clear training on the test set). Not on just training on similar formats, synthetic data etc. to improve. So it is a good approach that should last, but we need more. (@deliprao you had similar claim right?) https://w3id.org/np/RA5onfai3TcQTXxloau--mcY6JKg8yXNeMmqo29rFn4Qc#assertion https://schema.org/keywords dataset-templates https://w3id.org/np/RA5onfai3TcQTXxloau--mcY6JKg8yXNeMmqo29rFn4Qc#assertion https://schema.org/keywords evaluation-contamination https://w3id.org/np/RA5onfai3TcQTXxloau--mcY6JKg8yXNeMmqo29rFn4Qc#assertion https://schema.org/keywords ewok-datasets https://w3id.org/np/RA5onfai3TcQTXxloau--mcY6JKg8yXNeMmqo29rFn4Qc#assertion https://schema.org/keywords language-model-benchmarking https://w3id.org/np/RA5onfai3TcQTXxloau--mcY6JKg8yXNeMmqo29rFn4Qc#assertion https://schema.org/keywords varbench https://w3id.org/np/RA5onfai3TcQTXxloau--mcY6JKg8yXNeMmqo29rFn4Qc#assertion https://sense-nets.xyz/quotesPost https://x.com/neuranna/status/1791465842632454184 https://x.com/neuranna/status/1791465842632454184 https://sense-nets.xyz/hasZoteroItemType forumPost https://w3id.org/np/RA5onfai3TcQTXxloau--mcY6JKg8yXNeMmqo29rFn4Qc#provenance https://sense-nets.xyz/ http://www.w3.org/1999/02/22-rdf-syntax-ns#type http://www.w3.org/ns/prov#SoftwareAgent https://sense-nets.xyz/ http://www.w3.org/ns/prov#actedOnBehalfOf https://w3id.org/np/RAoSadUw99CeqDlR2400018nqTzR_38fT86OrTzk16Vts https://w3id.org/np/RA5onfai3TcQTXxloau--mcY6JKg8yXNeMmqo29rFn4Qc#activity http://www.w3.org/1999/02/22-rdf-syntax-ns#type https://sense-nets.xyz/supervisedActivity https://w3id.org/np/RA5onfai3TcQTXxloau--mcY6JKg8yXNeMmqo29rFn4Qc#activity http://www.w3.org/ns/prov#wasAssociatedWith https://sense-nets.xyz/ https://w3id.org/np/RA5onfai3TcQTXxloau--mcY6JKg8yXNeMmqo29rFn4Qc#assertion http://www.w3.org/ns/prov#linksTo https://x.com/LChoshen/status/1806396147281637645 https://w3id.org/np/RA5onfai3TcQTXxloau--mcY6JKg8yXNeMmqo29rFn4Qc#assertion http://www.w3.org/ns/prov#wasAssociatedWith https://x.com/LChoshen https://w3id.org/np/RA5onfai3TcQTXxloau--mcY6JKg8yXNeMmqo29rFn4Qc#assertion http://www.w3.org/ns/prov#wasAttributedTo https://orcid.org/0000-0002-0085-6496 https://w3id.org/np/RA5onfai3TcQTXxloau--mcY6JKg8yXNeMmqo29rFn4Qc#assertion http://www.w3.org/ns/prov#wasAttributedTo https://w3id.org/np/RAoSadUw99CeqDlR2400018nqTzR_38fT86OrTzk16Vts https://w3id.org/np/RA5onfai3TcQTXxloau--mcY6JKg8yXNeMmqo29rFn4Qc#assertion http://www.w3.org/ns/prov#wasGeneratedBy https://w3id.org/np/RA5onfai3TcQTXxloau--mcY6JKg8yXNeMmqo29rFn4Qc#activity https://w3id.org/np/RAoSadUw99CeqDlR2400018nqTzR_38fT86OrTzk16Vts http://xmlns.com/foaf/0.1/account https://orcid.org/0000-0002-0085-6496 https://w3id.org/np/RAoSadUw99CeqDlR2400018nqTzR_38fT86OrTzk16Vts http://xmlns.com/foaf/0.1/account https://x.com/LChoshen https://w3id.org/np/RA5onfai3TcQTXxloau--mcY6JKg8yXNeMmqo29rFn4Qc#pubinfo https://w3id.org/np/RA5onfai3TcQTXxloau--mcY6JKg8yXNeMmqo29rFn4Qc#sig http://purl.org/nanopub/x/hasAlgorithm RSA https://w3id.org/np/RA5onfai3TcQTXxloau--mcY6JKg8yXNeMmqo29rFn4Qc#sig http://purl.org/nanopub/x/hasPublicKey MIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEArHtI92jm8pAYVsvJabxLGfOT+7G0JyJGh2gwjB5x2pFPga6wWTd+rNBWWUZViIFnaJrBEsJpgdnoupLU9ppwn+khMiGRfxqGsDDzwHcj3Jc75CRys7d3etwXdBdoXfBgjsJiZBazwm13idr6tljRrC1TaEJBnRQAqzBw9cLDeGY77cSznzXT39feUGT168dpCSE9O6u/48DvvWVqciHGsH9cQ+LroJJVsMrorwtsdZnAK+q48wtIP6pIpw5shSJ5LnA0qeN/f4TvTFDV6ItYIXjiWWpTECc/Bxmfnyat3B5xWCu9nvz8fEs7Ns0TuzQwT3/K55iSKDEIi/E0nO97xwIDAQAB https://w3id.org/np/RA5onfai3TcQTXxloau--mcY6JKg8yXNeMmqo29rFn4Qc#sig http://purl.org/nanopub/x/hasSignature IhJE7e2QCuhi9lBnA6zyNAFEnuD0Kq+6UXzbq4THcwqG0odW9IFvLzUeFrsO55KnIGA1Mz4O5TDx9CZvCLnkRxNYmCM4ikItw54oCAYwDE40zONWvcYAZpeSQvECknmIwaTEikPBENjFKF6BxgYEgtxCP0pMc37iuAvUHQq5uBBugkbSr8FgRFi3+3IIOEiiWANOxiioxtznNCG7VdaSnD1XkvGQGwoS7HxUdQCOvj+1cZBs4YnLk2ZAixo0AI0X9N3/ucT0om5uGRBtkMdluhggmrK7v6o4dxGDH6pV/RE5RpnxIjOcdtzczPF0MhpaLCZiPPYbzFkIb3WITivdmw== https://w3id.org/np/RA5onfai3TcQTXxloau--mcY6JKg8yXNeMmqo29rFn4Qc#sig http://purl.org/nanopub/x/hasSignatureTarget https://w3id.org/np/RA5onfai3TcQTXxloau--mcY6JKg8yXNeMmqo29rFn4Qc https://w3id.org/np/RA5onfai3TcQTXxloau--mcY6JKg8yXNeMmqo29rFn4Qc#sig http://purl.org/nanopub/x/singedBy https://sense-nets.xyz/ https://w3id.org/np/RA5onfai3TcQTXxloau--mcY6JKg8yXNeMmqo29rFn4Qc#sig http://www.w3.org/ns/prov#wasAssociatedWith https://w3id.org/np/RAoSadUw99CeqDlR2400018nqTzR_38fT86OrTzk16VtssigningDelegation https://w3id.org/np/RA5onfai3TcQTXxloau--mcY6JKg8yXNeMmqo29rFn4Qc http://purl.org/dc/terms/created 2024-09-12T18:53:31.101Z https://w3id.org/np/RA5onfai3TcQTXxloau--mcY6JKg8yXNeMmqo29rFn4Qc http://purl.org/dc/terms/creator https://w3id.org/np/RAoSadUw99CeqDlR2400018nqTzR_38fT86OrTzk16Vts https://w3id.org/np/RA5onfai3TcQTXxloau--mcY6JKg8yXNeMmqo29rFn4Qc http://purl.org/dc/terms/license https://creativecommons.org/licenses/by/4.0/ https://w3id.org/np/RA5onfai3TcQTXxloau--mcY6JKg8yXNeMmqo29rFn4Qc http://purl.org/nanopub/x/hasNanopubType https://sense-nets.xyz/SemanticPost https://w3id.org/np/RA5onfai3TcQTXxloau--mcY6JKg8yXNeMmqo29rFn4Qc http://purl.org/nanopub/x/wasCreatedAt https://sense-nets.xyz/ https://w3id.org/np/RA5onfai3TcQTXxloau--mcY6JKg8yXNeMmqo29rFn4Qc http://www.w3.org/2000/01/rdf-schema#label CoSMO Semantic Post https://w3id.org/np/RA5onfai3TcQTXxloau--mcY6JKg8yXNeMmqo29rFn4Qc http://www.w3.org/ns/prov#wasAttributedTo https://orcid.org/0000-0002-0085-6496 https://w3id.org/np/RA5onfai3TcQTXxloau--mcY6JKg8yXNeMmqo29rFn4Qc https://sense-nets.xyz/hasRootSigner 0xf6ECcfD463afB464dcC85b051DF2E93E2646E6D2 https://w3id.org/np/RAoSadUw99CeqDlR2400018nqTzR_38fT86OrTzk16Vts http://xmlns.com/foaf/0.1/account https://orcid.org/0000-0002-0085-6496 https://w3id.org/np/RAoSadUw99CeqDlR2400018nqTzR_38fT86OrTzk16Vts http://xmlns.com/foaf/0.1/name Leshem Choshen 🤖🤗 @ICML wanna talk?