Here we have a great example of Simpson's paradox ;).
R is used almost exclusively by scientifically oriented programmers (statistical research, data science...), while Python is used as a scientific language by only a fraction of its users, since many people use it as a general-purpose language (web programming, OS scripting, apps...). I am almost positive that if you compare only the people who use both languages for scientific work (numpy, pandas...), you will obtain similar results. I will try to understand this function.
-- Anonymous

[2016-12-18 10:04:06 -08:00]
My function: unfortunately :(, I think that the best advice is to read the documentation of all the functions I use in R help.

Generators: https://en.wikipedia.org/wiki/Generator_(computer_programming)
-- Bogumił Kamiński

[2016-12-18 04:47:53 -08:00]
Thank you very much! I had attempted to generate all subsets of {1,...,200} to test an assumption of EBIC, but I just found it would take at least 2^200 bytes to store its power set. What should I read to understand your all.subsets.fast() function? What should I read to understand "generator"? I am a grad student in stats and have a BSc in math. Thank you very much!
-- Anonymous

[2016-12-08 00:39:17 -08:00]
Such a large structure simply does not fit into memory.
You would have to use a generator rather than a materialized structure.
Anyway, 2^50 = 1125899906842624, so you will not be able to iterate over such a large number of elements in any case.
-- Bogumił Kamiński

[2016-12-07 19:02:08 -08:00]
Hi, your code is much faster than set_power() in the sets package, but when I try all.subsets.fast(seq(1:50)), it says:
Error: cannot allocate vector of size 4194304.0 Gb. How can I fix it? Thanks!
-- Anonymous

[2016-12-06 22:45:44 -08:00]
Great post. Thanks for a nicer version of gen_data func!
Regarding pam: the post is about optimization in general. pam is limited in functionality and does not handle side constraints. For a real-life problem you can check the presentation I gave at EARL 2014 (https://goo.gl/YY1d31).
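On the generator suggestion in the subsets exchange above, here is a minimal sketch of what "generator rather than materialized structure" could look like in R. It is not the post's all.subsets.fast(); subset_generator is a hypothetical helper written only for illustration. It keeps a single logical mask in a closure and hands out one subset per call, so memory use stays proportional to the set size rather than to 2^n. The first line checks the arithmetic behind the reported error, assuming 4-byte integer storage and Gb meaning 1024^3 bytes.

```r
# Size in Gb of 2^50 four-byte integers (assumption: this is what the
# failed allocation was trying to hold).
2^50 * 4 / 2^30   # 4194304

# Hypothetical helper (not from the post): returns a function that yields
# one subset of `x` per call, and NULL once all 2^length(x) have been produced.
subset_generator <- function(x) {
  n <- length(x)
  mask <- rep(FALSE, n)   # membership vector of the subset to return next
  done <- FALSE
  function() {
    if (done) return(NULL)
    current <- x[mask]
    # Advance the mask like a binary counter: clear the leading run of
    # TRUEs, then set the first FALSE position.
    i <- 1
    while (i <= n && mask[i]) {
      mask[i] <<- FALSE
      i <- i + 1
    }
    if (i > n) done <<- TRUE else mask[i] <<- TRUE
    current
  }
}

# Usage: walk over all subsets of a 3-element set without storing them.
next_subset <- subset_generator(c("a", "b", "c"))
count <- 0
while (!is.null(s <- next_subset())) {
  count <- count + 1
}
count   # 8
```

This only removes the memory problem. As noted above, iterating over 2^50 subsets one at a time is still out of reach in practice.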
Use the ROCR package in R
use the function performance

example

    library(ROCR)
    # prediction() takes the predicted scores first, then the true labels
    pred <- prediction(predicted, actual)
    perf <- performance(pred, "tpr", "fpr")
    plot(perf, col = "red")
    abline(0, 1, lty = 8, col = "grey")

    auc <- performance(pred, "auc")
    unlist(auc@y.values)

-- Anonymous

[2016-08-02 18:37:03 -07:00]
I was looking for something like a singleton pattern in R to keep objects that are expensive to load, and this is the best approach I have found so far. Thanks for that!
Previously I was playing with globals, but this is messy and lintr was complaining about it.

This is how I used it to load the biomaRt object.

    load_biomart <- function() {
      # The cached object is stored as an attribute on the function itself
      ensembl <- attr(load_biomart, "cached_ensembl")
      if (is.null(ensembl)) {
        ## Loads ensembl biomart for Homo sapiens ensembl = useMart('ensembl',dataset='hsapiens_gene_ensembl')
        futile.logger::flog.info("Loading biomart for the first time")
        ensembl_mart <- biomaRt::useMart("ENSEMBL_MART_ENSEMBL", host = "www.ensembl.org")
        dataset <- "hsapiens_gene_ensembl"
        ensembl <- biomaRt::useDataset(dataset, mart = ensembl_mart)
        # Store the result so later calls skip the expensive load
        attr(load_biomart, "cached_ensembl") <<- ensembl
      } else {
        futile.logger::flog.info("Returning cached biomart")
      }
      ensembl
    }

-- Chino
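For comparison, the same load-once idea can also be sketched with a closure instead of an attribute on the function. This is only an illustration under stated assumptions: make_cached and slow_load are hypothetical names, and slow_load stands in for an expensive call such as the biomaRt one above so that the example runs without any packages.

```r
# Hypothetical stand-in for an expensive loader (e.g. a remote database handle).
slow_load <- function() {
  Sys.sleep(0.1)
  list(loaded_at = Sys.time())
}

# Wrap a zero-argument loader so that it runs at most once.
# Caveat: a loader that legitimately returns NULL would be re-run on every call.
make_cached <- function(loader) {
  cache <- NULL
  function() {
    if (is.null(cache)) {
      cache <<- loader()   # first call: run the loader and remember the result
    }
    cache
  }
}

get_resource <- make_cached(slow_load)
first <- get_resource()    # slow: performs the load
second <- get_resource()   # fast: returns the remembered object
identical(first, second)   # TRUE
```

The cached value lives in the environment of the returned function, so nothing is written to the global environment.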