. . . . "preprint" . . . . " To reduce evaluation contamination,\n@XuanmingZhang07 @Zhou_Yu_AI @columbianlp et al.\nconvert dataset examples into templates (Fig.)\nhttps://arxiv.org/abs/2406.17681\n\nEWOK datasets are built to have this trait\nhttps://x.com/neuranna/status/1791465842632454184\nInteresting trend: will it last? Will it solve contamination? https://twitter.com/LChoshen/status/1806396147281637645/photo/1\n\n @XuanmingZhang07 @Zhou_Yu_AI @columbianlp If you ask me, it is a nice step, but it only solves the worst kind of contamination (outright training on the test set). It does not address training on similar formats, synthetic data, etc. to improve scores.\nSo it is a good approach that should last, but we need more. (@deliprao, you made a similar claim, right?)\n\n" . "dataset-templates" . "evaluation-contamination" . "ewok-datasets" . "language-model-benchmarking" . "varbench" . . "forumPost" . . . . . . . . . . . . "RSA" . "MIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEArHtI92jm8pAYVsvJabxLGfOT+7G0JyJGh2gwjB5x2pFPga6wWTd+rNBWWUZViIFnaJrBEsJpgdnoupLU9ppwn+khMiGRfxqGsDDzwHcj3Jc75CRys7d3etwXdBdoXfBgjsJiZBazwm13idr6tljRrC1TaEJBnRQAqzBw9cLDeGY77cSznzXT39feUGT168dpCSE9O6u/48DvvWVqciHGsH9cQ+LroJJVsMrorwtsdZnAK+q48wtIP6pIpw5shSJ5LnA0qeN/f4TvTFDV6ItYIXjiWWpTECc/Bxmfnyat3B5xWCu9nvz8fEs7Ns0TuzQwT3/K55iSKDEIi/E0nO97xwIDAQAB" . "IhJE7e2QCuhi9lBnA6zyNAFEnuD0Kq+6UXzbq4THcwqG0odW9IFvLzUeFrsO55KnIGA1Mz4O5TDx9CZvCLnkRxNYmCM4ikItw54oCAYwDE40zONWvcYAZpeSQvECknmIwaTEikPBENjFKF6BxgYEgtxCP0pMc37iuAvUHQq5uBBugkbSr8FgRFi3+3IIOEiiWANOxiioxtznNCG7VdaSnD1XkvGQGwoS7HxUdQCOvj+1cZBs4YnLk2ZAixo0AI0X9N3/ucT0om5uGRBtkMdluhggmrK7v6o4dxGDH6pV/RE5RpnxIjOcdtzczPF0MhpaLCZiPPYbzFkIb3WITivdmw==" . . . . "2024-09-12T18:53:31.101Z"^^ . . . . . "CoSMO Semantic Post" . . "0xf6ECcfD463afB464dcC85b051DF2E93E2646E6D2" . . "Leshem Choshen 🤖🤗 @ICML wanna talk?" .