Leoluyi 呂奕 Coding with Data



[Git] Fixing garbled Chinese filenames in Git output

On a Mac, when you run git status or git ls-files, Git may show Chinese filenames as garbled backslash-escaped sequences instead of the actual characters.


This happens because, by default, Git prints only ASCII characters literally; the bytes of a UTF-8 filename are printed in quoted octal notation.

Git has always used octal utf8 display, and one way to show the actual name is by using printf in a bash shell.

Since Git 1.7.10 introduced the support of unicode, this wiki page mentions:

By default, git will print non-ASCII file names in quoted octal notation, i.e. “\nnn\nnn…”.

This can be disabled with:

git config [--global] core.quotepath off
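
To see the difference without touching your own configuration, here is a minimal sketch that builds a throwaway repository and lists the same file with quoting on and off. The temporary directory and the file name 中文檔名.txt are made up for illustration, and git -c applies the setting to a single command only.

```shell
#!/bin/sh
# Create a throwaway repository containing a file with a non-ASCII name.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
git init -q .
touch "中文檔名.txt"
git add .

# Quoting on (the default): non-ASCII bytes appear as quoted octal escapes.
quoted=$(git -c core.quotepath=true ls-files)

# Quoting off: the raw UTF-8 name is printed.
plain=$(git -c core.quotepath=off ls-files)

printf 'quoted: %s\n' "$quoted"
printf 'plain:  %s\n' "$plain"
```

The first line should show the name as a double-quoted run of \nnn escapes, and the second should show the readable name. Once that looks right, the git config command above makes the setting permanent.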




[R] Using TOR in R

I'll keep this post a little vague. Sometimes you need TOR to hide your IP address, but how do you do that from R?

Install TOR – onions included

TOR installation guide

Use TOR in R

TOR exposes a SOCKS5 proxy server, so in R you can route traffic through it with the curl package (the connection layer underneath httr) in several ways:

1. Set proxy in curl

Set the proxy on the handle object:

library(curl)
h <- new_handle(proxy = "socks5://localhost:9050")

2. Set proxy in httr

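library(httr)
# Pass the proxy as a per-request config
res <- GET("https://httpbin.org/get",
           use_proxy(url = "socks5://localhost", port = 9050))

Or use the global httr configuration:

set_config(
  use_proxy(url = "socks5://localhost", port = 9050)
)

# Reset the global configuration
reset_config()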

3. Set proxy globally (case sensitive!)

Set the environment variables directly; connections made in the current session will then use them:

Sys.setenv(HTTP_PROXY = "socks5://localhost:9050")
Sys.setenv(HTTPS_PROXY = "socks5://localhost:9050")

# Test: this even works in base R
readLines(base::url("https://httpbin.org/get", method = "libcurl"))

Full sample code




[R] Fixing "Peer certificate cannot be authenticated"

On a Windows machine:

Error in curl::curl_fetch_memory(url, handle = handle) :
  Peer certificate cannot be authenticated with given CA certificates

The machine in question is sitting behind a gnarly firewall and proxy, which I suspect are the source of the problem. I also need to use --ignore-certificate-errors when running chromium-browser, which points to the same issue.


The workaround is to disable peer certificate verification in httr:

library(httr)
set_config(config(ssl_verifypeer = 0L))

https://www.r-bloggers.com/fixing-peer-certificate-cannot-be-authenticated/
https://github.com/jimhester/gmailr/issues/44


[R] Package development - load dependencies

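When developing an R package, loading the package's dependencies into the development environment used to mean adding them by hand, one at a time. I later found that roxygen2 has a function, load_pkg_dependencies(), originally meant for testing that a package loads, which also works as a simple tool for setting up the environment.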


The source shows that it uses read_pkg_description() to read the DESCRIPTION file in the package path and then loads the packages listed under Depends and Imports:

function (path)
{
    desc <- read_pkg_description(path)
    pkgs <- paste(c(desc$Depends, desc$Imports), collapse = ", ")
    if (pkgs == "")
        return()
    pkgs <- str_replace_all(pkgs, "\\(.*?\\)", "")
    pkgs <- str_split(pkgs, ",")[[1]]
    pkgs <- str_trim(pkgs)
    lapply(pkgs[pkgs != "R"], require, character.only = TRUE)
}
<environment: namespace:roxygen2>

So after running load_pkg_dependencies(), we can develop in an environment where the other packages are already loaded.


Package Versioning

The content originates from Xie’s blog

Here are some consistent rules that keep version numbers comprehensible whenever we need to change them.

Format: major.minor.patch

  • Major is incremented when the release contains breaking changes; all other numbers are set to 0.
  • Minor is incremented when the release contains new non-breaking features; patch is set to 0.
  • Patch is incremented when the release only contains bugfixes and very minor/trivial features considered necessary.
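
As a rough illustration of the three increment rules above, here is a small shell sketch. The function name bump_version is hypothetical, and it assumes plain numeric parts with no leading zeros and no -dev suffix.

```shell
# bump_version VERSION PART  (hypothetical helper, not part of any tool)
# PART is one of: major, minor, patch. Prints the bumped version.
# The body runs in a subshell so the IFS change and variables do not leak.
bump_version() (
    version=$1
    part=$2
    # Split "x.y.z" on dots; a missing part (e.g. "0.1") counts as 0.
    IFS=.
    set -- $version
    major=${1:-0}
    minor=${2:-0}
    patch=${3:-0}
    case $part in
        # Breaking change: bump major, reset the rest to 0.
        major) echo "$((major + 1)).0.0" ;;
        # New non-breaking feature: bump minor, reset patch to 0.
        minor) echo "${major}.$((minor + 1)).0" ;;
        # Bugfix only: bump patch.
        patch) echo "${major}.${minor}.$((patch + 1))" ;;
        *) echo "unknown part: $part" >&2; exit 1 ;;
    esac
)
```

For example, bump_version 0.1.3 patch prints 0.1.4, and bump_version 0.1.4 minor prints 0.2.0.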

The dev notation refers to the next release, i.e.: 5.0.0-dev is the development version leading to 5.0.0.

For package development:

  1. a version number is of the form major.minor.patch (x.y.z), e.g., 0.1.7
  2. only the version x.y is released to CRAN
  3. x.y.z is always the development version, and each time a new feature, a bug fix, or a change is introduced, bump the patch version, e.g., from 0.1.3 to 0.1.4
  4. when one feels it is time to release to CRAN, bump the minor version, e.g., from 0.1 to 0.2
  5. when a change is crazy enough that many users are presumably going to yell at you (see the illustration above), it is time to bump the major version, e.g., from 0.18 to 1.0
  6. the version 1.0 does not imply maturity; it is just because it is potentially very different from 0.x (such as API changes); same thing applies to 2.0 vs 1.0
