tag:blogger.com,1999:blog-4946490806848569840.comments2016-12-08T00:39:17.578-08:00R snippetsBogumił Kamińskinoreply@blogger.comBlogger198125tag:blogger.com,1999:blog-4946490806848569840.post-9285473693288582672016-12-08T00:39:17.578-08:002016-12-08T00:39:17.578-08:00Such a large structure simply does not fit into me...Such a large structure simply does not fit into memory.<br />You would have to use generator rather than materialized structure. Anyway 2^50=1125899906842624 so you will not be able to iterate over such a large number of elements anyway.Bogumił Kamińskihttp://www.blogger.com/profile/06250268799809238730noreply@blogger.comtag:blogger.com,1999:blog-4946490806848569840.post-7839262773817293912016-12-07T19:02:08.590-08:002016-12-07T19:02:08.590-08:00Hi, your code is much faster than the set_power() ...Hi, your code is much faster than the set_power() in sets package, but when I try all.subsets.fast(seq(1:50)), it says:<br />Error: cannot allocate vector of size 4194304.0 Gb. How can I fix it? Thanks!Anonymousnoreply@blogger.comtag:blogger.com,1999:blog-4946490806848569840.post-58607957929094905012016-12-06T22:45:44.517-08:002016-12-06T22:45:44.517-08:00Great post. Thanks. Great post. Thanks. 常青树http://www.blogger.com/profile/09499813030229291918noreply@blogger.comtag:blogger.com,1999:blog-4946490806848569840.post-17136789747140136352016-12-06T08:56:54.244-08:002016-12-06T08:56:54.244-08:00Thanks for a nicer version of gen_data func!
Regar...Thanks for a nicer version of gen_data func!<br />Regarding pam. The post is about optimization in general. pam is limited in functionality and do not handle side constraints. For a real life problem you can check my presentation I gave at EARL 2014 (https://goo.gl/YY1d31). Wit Jakuczunhttp://www.blogger.com/profile/11026188487219287866noreply@blogger.comtag:blogger.com,1999:blog-4946490806848569840.post-82417787117758010962016-12-06T05:47:53.382-08:002016-12-06T05:47:53.382-08:00An absolutely killer-feature of Julia JuMP is the ...An absolutely killer-feature of Julia JuMP is the ability to define lazy-constraints via callbacks in a solver-independent manner. Many of the best algorithms for problems make use of such techniques (such as subtour-elimination in Travelling Salesman) and the ability to easily program them is something that I have not seen in any library other than JuMP.Adam Sardarhttp://www.blogger.com/profile/07517057242770665359noreply@blogger.comtag:blogger.com,1999:blog-4946490806848569840.post-14996400736357874222016-12-06T04:22:54.917-08:002016-12-06T04:22:54.917-08:00Thank you for the comment about ompr package. This...Thank you for the comment about ompr package. This is exactly what I have been looking for! I will have to check it out in detail, but it looks very good.<br /><br />As for the speed - most of the time is spent in solver and the solver is the same in both cases.Bogumił Kamińskihttp://www.blogger.com/profile/06250268799809238730noreply@blogger.comtag:blogger.com,1999:blog-4946490806848569840.post-73424061757968941182016-12-06T04:09:11.603-08:002016-12-06T04:09:11.603-08:00I really understand that glpkAPI has poor designed...I really understand that glpkAPI has poor designed interface in comparison with JuMP. But why R code is so unnecessary obfuscated and complicated in your example? Especially confusing is two type-castings to data.frame in 'gen_data'. In Julia you use matrix type for your data. Why don't you use matrix in R? Just for completeness - in real life R code for this tasks looks like this:<br /><br />gen_data = function(N, K, scaling) {<br /> alpha = 2 * pi * (1:N) / K<br /> sin_pi_K = 2 * sin(pi / K)<br /> x = matrix(rnorm(n = 2*N), nrow = N)<br /> x + cbind(cos(alpha), sin(alpha)) * scaling / sin_pi_K<br />}<br /><br />set.seed(1)<br />x = gen_data(200, 4, 5)<br /><br />library(cluster)<br />lapply(2:6, function(k) pam(x, k = k))Gregory Deminhttp://www.blogger.com/profile/09827917062640566527noreply@blogger.comtag:blogger.com,1999:blog-4946490806848569840.post-42416569071838202612016-12-06T02:29:01.511-08:002016-12-06T02:29:01.511-08:00Great post! Nice to have the comparisons, pretty c...Great post! Nice to have the comparisons, pretty clear that Julia's syntax really wins the day here, is there much of a speed difference? <br /><br />Also, have you seen the ompr package (https://github.com/dirkschumacher/ompr) that Dirk Schumacher is working on? It's inspired by the JuMP project.Surferhttp://www.blogger.com/profile/10433810043220326846noreply@blogger.comtag:blogger.com,1999:blog-4946490806848569840.post-50916168481686147142016-09-20T00:08:02.743-07:002016-09-20T00:08:02.743-07:00Use ROCR package in R
use function performance
ex...Use ROCR package in R<br />use function performance<br /><br />example<br /><br />library(ROCR)<br />pred<-prediction(actual,predicted)<br />perf<-performance(pred,"tpr","fpr")<br />plot(perf,col="red")<br />abline(0,1, lty = 8, col = "grey")<br /><br />auc<-performance(pred,"auc")<br />unlist(auc@y.values)Anonymousnoreply@blogger.comtag:blogger.com,1999:blog-4946490806848569840.post-76971942695951151442016-08-02T18:37:03.730-07:002016-08-02T18:37:03.730-07:00I was looking for something like a singleton patte...I was looking for something like a singleton pattern in R to keep objects that are expensive to load and this is the best approach I found so far. Thanks for that! <br />Previously I was playing with globals but this is messy and lintr was complaining about it.<br /><br />This is how I used it to load the biomaRt object.<br /><br />load_biomart <- function() {<br /> ensembl <- attr(load_biomart, "cached_ensembl")<br /> if (is.null(ensembl)) {<br /> ## Loads ensembl biomart for Homo sapiens ensembl = useMart('ensembl',dataset='hsapiens_gene_ensembl')<br /> futile.logger::flog.info("Loading biomart for the first time")<br /> ensembl_mart <- biomaRt::useMart("ENSEMBL_MART_ENSEMBL", host = "www.ensembl.org")<br /> dataset <- "hsapiens_gene_ensembl"<br /> ensembl <- biomaRt::useDataset(dataset, mart = ensembl_mart)<br /> attr(load_biomart, "cached_ensembl") <<- ensembl<br /> } else {<br /> futile.logger::flog.info("Returning cached biomart")<br /> }<br /> ensembl<br />}Chinohttp://www.blogger.com/profile/14034371996494176728noreply@blogger.comtag:blogger.com,1999:blog-4946490806848569840.post-13060725982294530982016-06-17T11:35:24.373-07:002016-06-17T11:35:24.373-07:00I have just tested copying the code from the websi...I have just tested copying the code from the website and pasting it to R and it just works.<br />From what you have pasted it seems that you perform copy-paste incorrectly. Observe that the first quotation mark is different than the second after "xlab="Bogumił Kamińskihttp://www.blogger.com/profile/06250268799809238730noreply@blogger.comtag:blogger.com,1999:blog-4946490806848569840.post-38520496445936809852016-06-17T11:25:19.519-07:002016-06-17T11:25:19.519-07:00Maybe I am a bit stupid but I cannot run it in R, ...Maybe I am a bit stupid but I cannot run it in R, I get many errors, the first one being:<br /><br />Error: unexpected input in:<br />" par(fin=c(4,4), fig=c(0,1,0,1))<br /> plot(x , y, axes = F, xlab=“"<br /><br />Josué Ortegahttp://www.blogger.com/profile/09954213911574064886noreply@blogger.comtag:blogger.com,1999:blog-4946490806848569840.post-53430974543761589112016-04-23T23:43:59.651-07:002016-04-23T23:43:59.651-07:008 neighbors (so called Moore neighborhood)8 neighbors (so called Moore neighborhood)Bogumił Kamińskihttp://www.blogger.com/profile/06250268799809238730noreply@blogger.comtag:blogger.com,1999:blog-4946490806848569840.post-416654436706979362016-04-23T18:08:05.118-07:002016-04-23T18:08:05.118-07:00Hi! Thank you for your code. Can you tell me how y...Hi! Thank you for your code. Can you tell me how you define radius? If you have a radius of 1, is that a radius of 1 agent? And does that mean each agent has 4 neighbors (left, right, up, down) in their radius of 1 or 8 neighbors (left, right, up, down, plus 4 corner agents)?Anonymousnoreply@blogger.comtag:blogger.com,1999:blog-4946490806848569840.post-9259232590870566222016-01-07T05:00:21.578-08:002016-01-07T05:00:21.578-08:00You would have to pass a properly wrapped base fun...You would have to pass a properly wrapped base functions, eg. pass<br />function(x) mean(x, na.rm = TRUE)<br />as an argument.Bogumił Kamińskihttp://www.blogger.com/profile/06250268799809238730noreply@blogger.comtag:blogger.com,1999:blog-4946490806848569840.post-45796409628150005222016-01-07T04:34:13.050-08:002016-01-07T04:34:13.050-08:00Hi, this is very useful. But what if the data has ...Hi, this is very useful. But what if the data has missing or NA values? How can we modify the code in order to account for NA values?Manhal Mohammad Alihttp://www.blogger.com/profile/02627312280931410360noreply@blogger.comtag:blogger.com,1999:blog-4946490806848569840.post-26619224729858180332015-05-31T18:55:18.042-07:002015-05-31T18:55:18.042-07:00Holy blog, THANK YOU so much, have a pleasant day ...Holy blog, THANK YOU so much, have a pleasant day :D kurnia sholihathttp://www.blogger.com/profile/10202633937791251852noreply@blogger.comtag:blogger.com,1999:blog-4946490806848569840.post-87749976293063045702015-05-29T06:52:12.718-07:002015-05-29T06:52:12.718-07:00I have other approach. There is only one model:
m...I have other approach. There is only one model:<br /><br />model <- lm(formula(paste("y ~", piece.formula("x", knots))))<br /><br />but it has hinge variables at knots which you can check when you execute:<br /><br />paste("y ~", piece.formula("x", knots))<br /><br />to get the formula:<br /><br />"y ~ x + I(pmax(x + 0.666666666666667, 0)) + I(pmax(x + 0.333333333333333, 0)) + I(pmax(x - 0, 0)) + I(pmax(x - 0.333333333333333, 0)) + I(pmax(x - 0.666666666666667, 0))"<br /><br />and the final model is:<br /><br />Coefficients:<br /> Estimate Std. Error t value Pr(>|t|) <br />(Intercept) 6.559 2.197 2.986 0.00530 ** <br />x 5.865 2.690 2.180 0.03647 * <br />I(pmax(x + 0.666666666666667, 0)) -22.189 4.520 -4.909 2.41e-05 ***<br />I(pmax(x + 0.333333333333333, 0)) 27.239 4.179 6.518 2.12e-07 ***<br />I(pmax(x - 0, 0)) -7.775 4.164 -1.867 0.07080 . <br />I(pmax(x - 0.333333333333333, 0)) -14.905 4.179 -3.567 0.00113 ** <br />I(pmax(x - 0.666666666666667, 0)) 15.860 4.520 3.509 0.00132 ** <br />Bogumił Kamińskihttp://www.blogger.com/profile/06250268799809238730noreply@blogger.comtag:blogger.com,1999:blog-4946490806848569840.post-8719617301552114822015-05-29T03:15:02.844-07:002015-05-29T03:15:02.844-07:00Hi, thank you for the code, i find my knots much e...Hi, thank you for the code, i find my knots much easier using it, but i still don't understand how to find the models, if i have 2 knots, then shouldn't i have 3 models? Can you show me how to find the models using your code? Thank you very muchkurnia sholihathttp://www.blogger.com/profile/10202633937791251852noreply@blogger.comtag:blogger.com,1999:blog-4946490806848569840.post-1505492409332568172015-01-31T00:09:41.551-08:002015-01-31T00:09:41.551-08:00You can start with learning parallel package.You can start with learning parallel package.Bogumił Kamińskihttp://www.blogger.com/profile/06250268799809238730noreply@blogger.comtag:blogger.com,1999:blog-4946490806848569840.post-25948653872085996502015-01-30T23:40:45.921-08:002015-01-30T23:40:45.921-08:00May I ask whether "sapply" is related to...May I ask whether "sapply" is related to parallel computing? I want to assign each subfunction to each core to compute the pvalues at the same time? Do you have any recommendation about that?Zhenchuan Wanghttp://www.blogger.com/profile/01825098384469325909noreply@blogger.comtag:blogger.com,1999:blog-4946490806848569840.post-29366747241336304132015-01-30T23:22:44.197-08:002015-01-30T23:22:44.197-08:00I do not see your code, but my code works with sap...I do not see your code, but my code works with sapply. Probably you should replace it by other function appropriate for your case.Bogumił Kamińskihttp://www.blogger.com/profile/06250268799809238730noreply@blogger.comtag:blogger.com,1999:blog-4946490806848569840.post-64001703806379302962015-01-30T23:07:54.856-08:002015-01-30T23:07:54.856-08:00If I want to apply four methods (subfunctions) to ...If I want to apply four methods (subfunctions) to a same sample (matrix) to calculate the p-values. What should I do? I notice that your function is only available to vectors.Zhenchuan Wanghttp://www.blogger.com/profile/01825098384469325909noreply@blogger.comtag:blogger.com,1999:blog-4946490806848569840.post-44985011419278527982015-01-09T06:39:00.528-08:002015-01-09T06:39:00.528-08:00Thx for the comment. However, notice that I use ==...Thx for the comment. However, notice that I use == only to show the results and it is clear that you would not use == in practice.<br /><br />But consider two things (as an example):<br />1) should it matter if you generate from low to hi or from hi to low (I would say it should not matter and in Python it does)<br />2) should you have your generated sequence as evenly spaced as possible (I would say yes and for example in R you have 'just beyond' that breaks this this assumption)Bogumił Kamińskihttp://www.blogger.com/profile/06250268799809238730noreply@blogger.comtag:blogger.com,1999:blog-4946490806848569840.post-58579104520839009552015-01-09T06:06:42.198-08:002015-01-09T06:06:42.198-08:00Hi,
well, maybe I miss the point, why you prefer ...Hi,<br /><br />well, maybe I miss the point, why you prefer Julia. Julia is giving 6 times "false" and R produces 6 times FALSE. Equal. The point is, however, that every programmer should consider using "==" or ".==" with floating point numbers to be bad und to be avoided. Python producing the most unexspected behaviour has the biggest chance, that the programmer notices a bug in his code before delivering or real-case using it. +1 for python. <br />Once you learned to avoid "==" with floats, the topic discussed here is of no relevance anymore (or, is it?).<br /><br />Greets,<br />BernhardAnonymousnoreply@blogger.com