tag:blogger.com,1999:blog-4946490806848569840.comments2017-01-16T04:43:49.113-08:00R snippetsBogumił Kamińskinoreply@blogger.comBlogger202125tag:blogger.com,1999:blog-4946490806848569840.post-68266099081588945922017-01-16T04:43:49.113-08:002017-01-16T04:43:49.113-08:00great articlegreat articleAbhijit Chandahttp://www.blogger.com/profile/09188944739603626888noreply@blogger.comtag:blogger.com,1999:blog-4946490806848569840.post-31156634314219241572016-12-18T22:07:06.510-08:002016-12-18T22:07:06.510-08:00Thank you very much for your help! I will try to u...Thank you very much for your help! I will try to understand this function.Anonymousnoreply@blogger.comtag:blogger.com,1999:blog-4946490806848569840.post-13885661492251314632016-12-18T10:04:06.686-08:002016-12-18T10:04:06.686-08:00My function: unfortunately :(, I think that the be...My function: unfortunately :(, I think that the best advice is to read documentation of all functions I use in R help.<br /><br />Generators: https://en.wikipedia.org/wiki/Generator_(computer_programming)Bogumił Kamińskihttp://www.blogger.com/profile/06250268799809238730noreply@blogger.comtag:blogger.com,1999:blog-4946490806848569840.post-41872857698365228842016-12-18T04:47:53.912-08:002016-12-18T04:47:53.912-08:00Thank you very much! I had attempted to generate a...Thank you very much! I had attempted to generate all subsets of {1,...,200} to test an assumption of EBIC, but I just found it would take at least 2^200 bytes to store its power sets. What should I read to understand your all.subsets.fast() function? What should I read to understand "generator"? I am a grad student in stats, have a bsc in math. Thank you very much!Anonymousnoreply@blogger.comtag:blogger.com,1999:blog-4946490806848569840.post-9285473693288582672016-12-08T00:39:17.578-08:002016-12-08T00:39:17.578-08:00Such a large structure simply does not fit into me...Such a large structure simply does not fit into memory.<br />You would have to use generator rather than materialized structure. Anyway 2^50=1125899906842624 so you will not be able to iterate over such a large number of elements anyway.Bogumił Kamińskihttp://www.blogger.com/profile/06250268799809238730noreply@blogger.comtag:blogger.com,1999:blog-4946490806848569840.post-7839262773817293912016-12-07T19:02:08.590-08:002016-12-07T19:02:08.590-08:00Hi, your code is much faster than the set_power() ...Hi, your code is much faster than the set_power() in sets package, but when I try all.subsets.fast(seq(1:50)), it says:<br />Error: cannot allocate vector of size 4194304.0 Gb. How can I fix it? Thanks!Anonymousnoreply@blogger.comtag:blogger.com,1999:blog-4946490806848569840.post-58607957929094905012016-12-06T22:45:44.517-08:002016-12-06T22:45:44.517-08:00Great post. Thanks. Great post. Thanks. 常青树http://www.blogger.com/profile/09499813030229291918noreply@blogger.comtag:blogger.com,1999:blog-4946490806848569840.post-17136789747140136352016-12-06T08:56:54.244-08:002016-12-06T08:56:54.244-08:00Thanks for a nicer version of gen_data func!
Regar...Thanks for a nicer version of gen_data func!<br />Regarding pam. The post is about optimization in general. pam is limited in functionality and do not handle side constraints. For a real life problem you can check my presentation I gave at EARL 2014 (https://goo.gl/YY1d31). Wit Jakuczunhttp://www.blogger.com/profile/11026188487219287866noreply@blogger.comtag:blogger.com,1999:blog-4946490806848569840.post-82417787117758010962016-12-06T05:47:53.382-08:002016-12-06T05:47:53.382-08:00An absolutely killer-feature of Julia JuMP is the ...An absolutely killer-feature of Julia JuMP is the ability to define lazy-constraints via callbacks in a solver-independent manner. Many of the best algorithms for problems make use of such techniques (such as subtour-elimination in Travelling Salesman) and the ability to easily program them is something that I have not seen in any library other than JuMP.Adam Sardarhttp://www.blogger.com/profile/07517057242770665359noreply@blogger.comtag:blogger.com,1999:blog-4946490806848569840.post-14996400736357874222016-12-06T04:22:54.917-08:002016-12-06T04:22:54.917-08:00Thank you for the comment about ompr package. This...Thank you for the comment about ompr package. This is exactly what I have been looking for! I will have to check it out in detail, but it looks very good.<br /><br />As for the speed - most of the time is spent in solver and the solver is the same in both cases.Bogumił Kamińskihttp://www.blogger.com/profile/06250268799809238730noreply@blogger.comtag:blogger.com,1999:blog-4946490806848569840.post-73424061757968941182016-12-06T04:09:11.603-08:002016-12-06T04:09:11.603-08:00I really understand that glpkAPI has poor designed...I really understand that glpkAPI has poor designed interface in comparison with JuMP. But why R code is so unnecessary obfuscated and complicated in your example? Especially confusing is two type-castings to data.frame in 'gen_data'. In Julia you use matrix type for your data. Why don't you use matrix in R? Just for completeness - in real life R code for this tasks looks like this:<br /><br />gen_data = function(N, K, scaling) {<br /> alpha = 2 * pi * (1:N) / K<br /> sin_pi_K = 2 * sin(pi / K)<br /> x = matrix(rnorm(n = 2*N), nrow = N)<br /> x + cbind(cos(alpha), sin(alpha)) * scaling / sin_pi_K<br />}<br /><br />set.seed(1)<br />x = gen_data(200, 4, 5)<br /><br />library(cluster)<br />lapply(2:6, function(k) pam(x, k = k))Gregory Deminhttp://www.blogger.com/profile/09827917062640566527noreply@blogger.comtag:blogger.com,1999:blog-4946490806848569840.post-42416569071838202612016-12-06T02:29:01.511-08:002016-12-06T02:29:01.511-08:00Great post! Nice to have the comparisons, pretty c...Great post! Nice to have the comparisons, pretty clear that Julia's syntax really wins the day here, is there much of a speed difference? <br /><br />Also, have you seen the ompr package (https://github.com/dirkschumacher/ompr) that Dirk Schumacher is working on? It's inspired by the JuMP project.Surferhttp://www.blogger.com/profile/10433810043220326846noreply@blogger.comtag:blogger.com,1999:blog-4946490806848569840.post-50916168481686147142016-09-20T00:08:02.743-07:002016-09-20T00:08:02.743-07:00Use ROCR package in R
use function performance
ex...Use ROCR package in R<br />use function performance<br /><br />example<br /><br />library(ROCR)<br />pred<-prediction(actual,predicted)<br />perf<-performance(pred,"tpr","fpr")<br />plot(perf,col="red")<br />abline(0,1, lty = 8, col = "grey")<br /><br />auc<-performance(pred,"auc")<br />unlist(auc@y.values)Anonymousnoreply@blogger.comtag:blogger.com,1999:blog-4946490806848569840.post-76971942695951151442016-08-02T18:37:03.730-07:002016-08-02T18:37:03.730-07:00I was looking for something like a singleton patte...I was looking for something like a singleton pattern in R to keep objects that are expensive to load and this is the best approach I found so far. Thanks for that! <br />Previously I was playing with globals but this is messy and lintr was complaining about it.<br /><br />This is how I used it to load the biomaRt object.<br /><br />load_biomart <- function() {<br /> ensembl <- attr(load_biomart, "cached_ensembl")<br /> if (is.null(ensembl)) {<br /> ## Loads ensembl biomart for Homo sapiens ensembl = useMart('ensembl',dataset='hsapiens_gene_ensembl')<br /> futile.logger::flog.info("Loading biomart for the first time")<br /> ensembl_mart <- biomaRt::useMart("ENSEMBL_MART_ENSEMBL", host = "www.ensembl.org")<br /> dataset <- "hsapiens_gene_ensembl"<br /> ensembl <- biomaRt::useDataset(dataset, mart = ensembl_mart)<br /> attr(load_biomart, "cached_ensembl") <<- ensembl<br /> } else {<br /> futile.logger::flog.info("Returning cached biomart")<br /> }<br /> ensembl<br />}Chinohttp://www.blogger.com/profile/14034371996494176728noreply@blogger.comtag:blogger.com,1999:blog-4946490806848569840.post-13060725982294530982016-06-17T11:35:24.373-07:002016-06-17T11:35:24.373-07:00I have just tested copying the code from the websi...I have just tested copying the code from the website and pasting it to R and it just works.<br />From what you have pasted it seems that you perform copy-paste incorrectly. Observe that the first quotation mark is different than the second after "xlab="Bogumił Kamińskihttp://www.blogger.com/profile/06250268799809238730noreply@blogger.comtag:blogger.com,1999:blog-4946490806848569840.post-38520496445936809852016-06-17T11:25:19.519-07:002016-06-17T11:25:19.519-07:00Maybe I am a bit stupid but I cannot run it in R, ...Maybe I am a bit stupid but I cannot run it in R, I get many errors, the first one being:<br /><br />Error: unexpected input in:<br />" par(fin=c(4,4), fig=c(0,1,0,1))<br /> plot(x , y, axes = F, xlab=“"<br /><br />Josué Ortegahttp://www.blogger.com/profile/09954213911574064886noreply@blogger.comtag:blogger.com,1999:blog-4946490806848569840.post-53430974543761589112016-04-23T23:43:59.651-07:002016-04-23T23:43:59.651-07:008 neighbors (so called Moore neighborhood)8 neighbors (so called Moore neighborhood)Bogumił Kamińskihttp://www.blogger.com/profile/06250268799809238730noreply@blogger.comtag:blogger.com,1999:blog-4946490806848569840.post-416654436706979362016-04-23T18:08:05.118-07:002016-04-23T18:08:05.118-07:00Hi! Thank you for your code. Can you tell me how y...Hi! Thank you for your code. Can you tell me how you define radius? If you have a radius of 1, is that a radius of 1 agent? And does that mean each agent has 4 neighbors (left, right, up, down) in their radius of 1 or 8 neighbors (left, right, up, down, plus 4 corner agents)?Anonymousnoreply@blogger.comtag:blogger.com,1999:blog-4946490806848569840.post-9259232590870566222016-01-07T05:00:21.578-08:002016-01-07T05:00:21.578-08:00You would have to pass a properly wrapped base fun...You would have to pass a properly wrapped base functions, eg. pass<br />function(x) mean(x, na.rm = TRUE)<br />as an argument.Bogumił Kamińskihttp://www.blogger.com/profile/06250268799809238730noreply@blogger.comtag:blogger.com,1999:blog-4946490806848569840.post-45796409628150005222016-01-07T04:34:13.050-08:002016-01-07T04:34:13.050-08:00Hi, this is very useful. But what if the data has ...Hi, this is very useful. But what if the data has missing or NA values? How can we modify the code in order to account for NA values?Manhal Mohammad Alihttp://www.blogger.com/profile/02627312280931410360noreply@blogger.comtag:blogger.com,1999:blog-4946490806848569840.post-26619224729858180332015-05-31T18:55:18.042-07:002015-05-31T18:55:18.042-07:00Holy blog, THANK YOU so much, have a pleasant day ...Holy blog, THANK YOU so much, have a pleasant day :D kurnia sholihathttp://www.blogger.com/profile/10202633937791251852noreply@blogger.comtag:blogger.com,1999:blog-4946490806848569840.post-87749976293063045702015-05-29T06:52:12.718-07:002015-05-29T06:52:12.718-07:00I have other approach. There is only one model:
m...I have other approach. There is only one model:<br /><br />model <- lm(formula(paste("y ~", piece.formula("x", knots))))<br /><br />but it has hinge variables at knots which you can check when you execute:<br /><br />paste("y ~", piece.formula("x", knots))<br /><br />to get the formula:<br /><br />"y ~ x + I(pmax(x + 0.666666666666667, 0)) + I(pmax(x + 0.333333333333333, 0)) + I(pmax(x - 0, 0)) + I(pmax(x - 0.333333333333333, 0)) + I(pmax(x - 0.666666666666667, 0))"<br /><br />and the final model is:<br /><br />Coefficients:<br /> Estimate Std. Error t value Pr(>|t|) <br />(Intercept) 6.559 2.197 2.986 0.00530 ** <br />x 5.865 2.690 2.180 0.03647 * <br />I(pmax(x + 0.666666666666667, 0)) -22.189 4.520 -4.909 2.41e-05 ***<br />I(pmax(x + 0.333333333333333, 0)) 27.239 4.179 6.518 2.12e-07 ***<br />I(pmax(x - 0, 0)) -7.775 4.164 -1.867 0.07080 . <br />I(pmax(x - 0.333333333333333, 0)) -14.905 4.179 -3.567 0.00113 ** <br />I(pmax(x - 0.666666666666667, 0)) 15.860 4.520 3.509 0.00132 ** <br />Bogumił Kamińskihttp://www.blogger.com/profile/06250268799809238730noreply@blogger.comtag:blogger.com,1999:blog-4946490806848569840.post-8719617301552114822015-05-29T03:15:02.844-07:002015-05-29T03:15:02.844-07:00Hi, thank you for the code, i find my knots much e...Hi, thank you for the code, i find my knots much easier using it, but i still don't understand how to find the models, if i have 2 knots, then shouldn't i have 3 models? Can you show me how to find the models using your code? Thank you very muchkurnia sholihathttp://www.blogger.com/profile/10202633937791251852noreply@blogger.comtag:blogger.com,1999:blog-4946490806848569840.post-1505492409332568172015-01-31T00:09:41.551-08:002015-01-31T00:09:41.551-08:00You can start with learning parallel package.You can start with learning parallel package.Bogumił Kamińskihttp://www.blogger.com/profile/06250268799809238730noreply@blogger.comtag:blogger.com,1999:blog-4946490806848569840.post-25948653872085996502015-01-30T23:40:45.921-08:002015-01-30T23:40:45.921-08:00May I ask whether "sapply" is related to...May I ask whether "sapply" is related to parallel computing? I want to assign each subfunction to each core to compute the pvalues at the same time? Do you have any recommendation about that?Zhenchuan Wanghttp://www.blogger.com/profile/01825098384469325909noreply@blogger.com