Blog Archive

Monday, February 17, 2025

How to set up remote ssh without password from VSCode

 You can set up passwordless SSH access from VSCode by using SSH keys. Here’s how:

  1. Generate an SSH key pair (if you don’t have one):
    Open your terminal and run:

    ssh-keygen -t rsa -b 4096

    Follow the prompts. If you want a completely passwordless experience, leave the passphrase empty.

  2. Copy your public key to the remote server:
    Use the ssh-copy-id command (replace username and remote_server_address with your own values):

    ssh-copy-id username@remote_server_address

    This adds your public key (typically ~/.ssh/id_rsa.pub) to the ~/.ssh/authorized_keys file on the remote machine.

  3. Verify remote SSH settings:
    Ensure the remote server’s SSH configuration (usually in /etc/ssh/sshd_config) has PubkeyAuthentication yes enabled, and that the ~/.ssh directory and authorized_keys file have the correct permissions.

  4. Configure your SSH client (optional but useful):
    Edit (or create) the ~/.ssh/config file on your local machine (on Windows it lives in C:\Users\<YourUsername>\.ssh) and add an entry like:

    Host my-remote
        HostName remote_server_address
        User username
        IdentityFile ~/.ssh/id_rsa

    This simplifies the connection process.

  5. Connect using VSCode’s Remote - SSH extension:

    • Install the Remote - SSH extension in VSCode if you haven’t already.
    • Open the Command Palette (Ctrl+Shift+P or Cmd+Shift+P) and type “Remote-SSH: Connect to Host…”.
    • Select the host (either from your SSH config or enter it manually).

VSCode will use your SSH key for authentication, allowing you to log in without entering a password.
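For the permission check in step 3, here is a minimal sketch of how one might test the modes on the remote machine. The helper name permissions_ok is hypothetical, and the values 700 for ~/.ssh and 600 for authorized_keys are the commonly recommended ones, not something this post specifies. The demo runs against a throwaway directory on a POSIX system rather than a real ~/.ssh.

```python
import os
import stat
import tempfile


def permissions_ok(path, allowed):
    """Return True if `path` grants no permission bits beyond `allowed`."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return (mode & ~allowed) == 0


# Demonstrate on a temporary directory instead of touching a real ~/.ssh.
with tempfile.TemporaryDirectory() as tmp:
    ssh_dir = os.path.join(tmp, ".ssh")
    os.mkdir(ssh_dir)
    key_file = os.path.join(ssh_dir, "authorized_keys")
    open(key_file, "w").close()

    os.chmod(ssh_dir, 0o755)   # too open: group and others can list the directory
    os.chmod(key_file, 0o600)  # owner read/write only
    dir_ok_before = permissions_ok(ssh_dir, 0o700)

    os.chmod(ssh_dir, 0o700)   # tightened to owner only
    dir_ok_after = permissions_ok(ssh_dir, 0o700)
    file_ok = permissions_ok(key_file, 0o600)

print(dir_ok_before, dir_ok_after, file_ok)
```

To check a real account, call permissions_ok(os.path.expanduser("~/.ssh"), 0o700) on the remote server.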


FAQ:

Q1) Where is ~/.ssh on Windows?

A1) On Windows, the equivalent of the ~/.ssh folder is located in your user profile directory, typically at:    

C:\Users\<YourUsername>\.ssh

Q2) How do I fix the following error?

ssh-copy-id username@remote_server_address
ssh-copy-id : The term 'ssh-copy-id' is not recognized as the name of a cmdlet, function, script file, or operable program. Check the spelling of the name, or if a path was included, verify that the path is correct and try again.
At line:1 char:1
+ ssh-copy-id username@remote_server_address
+ ~~~~~~~~~~~
    + CategoryInfo          : ObjectNotFound: (ssh-copy-id:String) [], CommandNotFoundException
    + FullyQualifiedErrorId : CommandNotFoundException

A2) PowerShell has no ssh-copy-id command. Pipe the public key over ssh instead (replace username and remote_server_address with your own values):
type $env:USERPROFILE\.ssh\id_rsa.pub | ssh username@remote_server_address "mkdir -p ~/.ssh && cat >> ~/.ssh/authorized_keys"
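The remote half of that one-liner creates ~/.ssh if needed and appends the key to authorized_keys. A rough Python sketch of the same idea follows, with one addition the one-liner does not make: it skips the append when the key is already present. The function name add_public_key and the sample key string are made up for illustration, and the demo uses a temporary directory as a stand-in for the remote home directory.

```python
import os
import tempfile


def add_public_key(home, public_key):
    """Append public_key to <home>/.ssh/authorized_keys unless it is already there.

    Returns True if the key was added, False if it was already present.
    """
    ssh_dir = os.path.join(home, ".ssh")
    os.makedirs(ssh_dir, mode=0o700, exist_ok=True)
    path = os.path.join(ssh_dir, "authorized_keys")

    content = ""
    if os.path.exists(path):
        with open(path, "r", encoding="utf-8") as f:
            content = f.read()
    if public_key.strip() in [line.strip() for line in content.splitlines()]:
        return False

    with open(path, "a", encoding="utf-8") as f:
        if content and not content.endswith("\n"):
            f.write("\n")  # keep the new key on its own line
        f.write(public_key.strip() + "\n")
    os.chmod(path, 0o600)
    return True


with tempfile.TemporaryDirectory() as fake_home:
    key = "ssh-rsa AAAAexample user@laptop"  # placeholder, not a real key
    first = add_public_key(fake_home, key)
    second = add_public_key(fake_home, key)
    with open(os.path.join(fake_home, ".ssh", "authorized_keys")) as f:
        stored = f.read().splitlines()

print(first, second, stored)
```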

Saturday, February 1, 2025

Tutorial: load(s) and dump(s) for JSON

 

>>> help(json.load)

load(fp, *, cls=None, object_hook=None, parse_float=None, parse_int=None, parse_constant=None, object_pairs_hook=None, **kw)

    Deserialize ``fp`` (a ``.read()``-supporting file-like object containing a JSON document) to a Python object.                

 

>>> help(json.loads)

loads(s, *, cls=None, object_hook=None, parse_float=None, parse_int=None, parse_constant=None, object_pairs_hook=None, **kw)

    Deserialize ``s`` (a ``str``, ``bytes`` or ``bytearray`` instance containing a JSON document) to a Python object.
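The practical difference between the two signatures is the input type: json.load reads from a file-like object, while json.loads parses a str, bytes, or bytearray that is already in memory. A small sketch, using io.StringIO as a stand-in for an open file (the sample document is made up):

```python
import io
import json

text = '{"name": "Alice", "age": 25}'

from_str = json.loads(text)                    # str input
from_bytes = json.loads(text.encode("utf-8"))  # bytes input
from_file = json.load(io.StringIO(text))       # file-like object with .read()

# All three calls produce the same dictionary.
print(from_str == from_bytes == from_file)
```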

 


Example using json.load

Assume you have a file named data.json with the following content:

{
  "name": "Alice",
  "age": 25,
  "city": "NY"
}

You can load this JSON data into a Python dictionary as follows:

import json

# Open the JSON file and parse its content
with open('data.json', 'r') as file:
    data = json.load(file)

print(data)
# Output: {'name': 'Alice', 'age': 25, 'city': 'NY'}

 

Example using json.loads

Suppose you have a JSON string:

import json

json_string = '{"name": "Bob", "age": 30, "city": "NY"}'

# Parse the JSON string into a Python dictionary
data = json.loads(json_string)

print(data)
# Output: {'name': 'Bob', 'age': 30, 'city': 'NY'}

 

Examples using json.dump, str(), and json.dumps:

1. Saving a Dictionary to a JSON File

You can use the json.dump function to write a dictionary to a file in JSON format. This method writes the data directly to the file in a structured JSON format, which is easy to read back later using json.load.

 

import json

data = {
    "name": "Alice",
    "age": 25,
    "city": "Wonderland"
}

# Open a file in write mode and dump the dictionary as JSON
with open('data.json', 'w') as json_file:
    json.dump(data, json_file, indent=4)  # 'indent=4' makes the output nicely formatted
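To confirm that a file written by json.dump reads back with json.load, here is a short round-trip sketch. It writes to a temporary directory so it does not overwrite an existing data.json; the explicit encoding="utf-8" is an addition, not something the example above uses.

```python
import json
import os
import tempfile

data = {"name": "Alice", "age": 25, "city": "Wonderland"}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "data.json")

    # Write the dictionary as indented JSON.
    with open(path, "w", encoding="utf-8") as f:
        json.dump(data, f, indent=4)

    # Read it back into a new dictionary.
    with open(path, "r", encoding="utf-8") as f:
        restored = json.load(f)

print(restored == data)
```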

 

 

2. Saving a Dictionary to a Text File as a String

There are two common approaches depending on what string format you need:

 

Method A: Using str()

This method writes the Python dictionary's string representation to the file. Note that this representation is not valid JSON but rather the standard Python dictionary format.

 

data = {
    "name": "Alice",
    "age": 25,
    "city": "Wonderland"
}

# Open a text file in write mode and write the dictionary as a string
with open('data.txt', 'w') as text_file:
    text_file.write(str(data))

 

The content of data.txt will look like:

{'name': 'Alice', 'age': 25, 'city': 'Wonderland'}

 

 

Method B: Using json.dumps()

If you prefer to store the dictionary as a JSON-formatted string in a text file (which is both human-readable and can be easily parsed later), you can convert it to a string first with json.dumps and then write it.

import json

data = {
    "name": "Alice",
    "age": 25,
    "city": "Wonderland"
}

# Convert the dictionary to a JSON formatted string
json_string = json.dumps(data, indent=4)

# Open a text file in write mode and write the JSON string
with open('data.txt', 'w') as text_file:
    text_file.write(json_string)

The resulting data.txt will contain:

 

{
    "name": "Alice",
    "age": 25,
    "city": "Wonderland"
}

Summary

json.dump: Directly writes a dictionary to a file as JSON.

str(): Converts the dictionary to its Python string representation.

json.dumps: Converts the dictionary to a JSON formatted string, which can then be written to a text file.

Choose the method that best fits your needs for later reading or processing the file.
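The difference between str() and json.dumps goes beyond quote style. A brief sketch with a dictionary that also holds True and None (sample values chosen for illustration) shows that only the json.dumps output can be read back by json.loads:

```python
import json

data = {"name": "Alice", "age": 25, "active": True, "nickname": None}

as_repr = str(data)        # Python repr: single quotes, True, None
as_json = json.dumps(data)  # JSON text: double quotes, true, null

print(as_repr)
print(as_json)

# The JSON text round-trips back to an equal dictionary.
round_trip = json.loads(as_json)

# The Python repr is not valid JSON, so json.loads rejects it.
try:
    json.loads(as_repr)
    repr_parses = True
except json.JSONDecodeError:
    repr_parses = False

print(round_trip == data, repr_parses)
```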

 


 

 

 

 


Help me fix:

>>> sx
"{'name': 'Bob', 'age': 30, 'city': 'Builderland'}"
>>> sd = json.loads(sx)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.12_3.12.2288.0_x64__qbz5n2kfra8p0\Lib\json\__init__.py", line 346, in loads
    return _default_decoder.decode(s)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.12_3.12.2288.0_x64__qbz5n2kfra8p0\Lib\json\decoder.py", line 338, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.12_3.12.2288.0_x64__qbz5n2kfra8p0\Lib\json\decoder.py", line 354, in raw_decode
    obj, end = self.scan_once(s, idx)
               ^^^^^^^^^^^^^^^^^^^^^^
json.decoder.JSONDecodeError: Expecting property name enclosed in double quotes: line 1 column 2 (char 1)

 

GPT:

The error occurs because JSON requires double quotes (") for both keys and string values, but your string uses single quotes ('). Here are two ways to fix this:

 

1. Convert the String to Valid JSON

If you can change the string to proper JSON format, use double quotes. For example:

import json

# Valid JSON string with double quotes
sx = '{"name": "Bob", "age": 30, "city": "Builderland"}'
sd = json.loads(sx)
print(sd)

 

2. Use ast.literal_eval for a Python Literal

If you have a string in Python's dictionary format (using single quotes) and you want to convert it into a dictionary, you can use the ast.literal_eval function:

import ast

sx = "{'name': 'Bob', 'age': 30, 'city': 'Builderland'}"
sd = ast.literal_eval(sx)  # sd is a dict
print(sd)

This prints:

{'name': 'Bob', 'age': 30, 'city': 'Builderland'}

Converting the dictionary back with str() reproduces the original string:

str(sd) == sx  # True
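When the input may arrive in either form, the two fixes can be combined. The helper below is a hypothetical sketch (the name parse_dict_text is made up): it tries strict JSON first and falls back to ast.literal_eval only when JSON parsing fails.

```python
import ast
import json


def parse_dict_text(text):
    """Parse text as JSON, falling back to a Python literal if that fails."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        # literal_eval only evaluates literals (dicts, lists, strings, numbers...),
        # so it does not run arbitrary code the way eval() would.
        return ast.literal_eval(text)


from_json = parse_dict_text('{"name": "Bob", "age": 30}')
from_repr = parse_dict_text("{'name': 'Bob', 'age': 30}")

print(from_json == from_repr)
```

If the text is neither valid JSON nor a valid Python literal, ast.literal_eval raises ValueError or SyntaxError, which the caller still has to handle.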

 

 

Please comment on the following two strings from the perspective of reading and writing JSON. Which one is preferred?

string1='{"name":"Bob", "age":30,"city": "NY"}'  

string2="{'name': 'Bob', 'age': 30, 'city': 'NY'}"

 

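Answer: string1 is preferred. It is valid JSON, so json.loads can parse it. string2 is the Python representation of a dictionary (single quotes), which json.loads rejects.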

 

Best Practice:

Always use json.dumps() to generate JSON strings from Python dictionaries, as it will ensure the proper formatting (i.e., using double quotes).

Friday, December 6, 2024

What are SFT, PPO, and DPO?

In the context of reinforcement learning (RL) and fine-tuning, SFT (Supervised Fine-Tuning), PPO (Proximal Policy Optimization), and DPO (Direct Preference Optimization) are approaches used for training large language models (LLMs). Here is how they relate and the role each one plays:


1. SFT (Supervised Fine-Tuning)

  • What it is: SFT is the process of fine-tuning a pre-trained language model on labeled datasets. The model is trained to predict the correct output (e.g., a response or classification) given an input.
  • Role: This step establishes a baseline model that learns task-specific patterns from curated datasets.
  • Relation to RL: SFT is typically used as a precursor to RL-based methods like PPO or DPO. While SFT relies on explicit supervision, RL-based methods rely on feedback signals.
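As a rough illustration of "predicting the correct output", SFT for language models is usually trained by minimizing the negative log-likelihood of the labeled tokens. The post does not name a loss, so treat that as an assumption about the common setup. The function name and the probabilities below are made up.

```python
import math


def sft_loss(token_probs):
    """Average negative log-likelihood of the reference tokens.

    token_probs: probability the model assigned to each correct token.
    """
    return -sum(math.log(p) for p in token_probs) / len(token_probs)


# A model that puts high probability on the labeled tokens gets a lower loss.
confident = sft_loss([0.9, 0.8, 0.95])
unsure = sft_loss([0.3, 0.2, 0.4])

print(confident < unsure)
```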

2. PPO (Proximal Policy Optimization)

  • What it is: PPO is an RL algorithm that optimizes a policy using rewards. In the context of LLMs, it is used in RLHF (Reinforcement Learning from Human Feedback) to align models with human preferences.
  • How it works:
    • A reward model (often trained on human preference data) provides feedback on the quality of model outputs.
    • PPO adjusts the model to maximize these rewards while keeping updates stable: its clipped objective limits how far each update moves the policy, and RLHF setups typically add a KL penalty so the policy does not drift too far from the original SFT model.
  • Relation to SFT: PPO fine-tunes the SFT model further by incorporating reward signals to improve alignment with human preferences.
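The stability idea can be sketched with PPO's clipped surrogate objective for a single action. Here ratio is the new policy's probability divided by the old policy's, and advantage says how much better than expected the action was. The KL penalty against the SFT model that RLHF pipelines often add is not shown. The function name and the numbers are illustrative only.

```python
def ppo_clipped_objective(ratio, advantage, epsilon=0.2):
    """Clipped surrogate objective for one sample (to be maximized)."""
    clipped_ratio = max(1.0 - epsilon, min(ratio, 1.0 + epsilon))
    # Taking the minimum removes any incentive to push the ratio
    # outside [1 - epsilon, 1 + epsilon].
    return min(ratio * advantage, clipped_ratio * advantage)


# Good action, policy already moved a lot toward it: the gain is capped.
capped_gain = ppo_clipped_objective(ratio=1.5, advantage=1.0)

# Bad action, policy already moved a lot away from it: also clipped.
capped_penalty = ppo_clipped_objective(ratio=0.5, advantage=-1.0)

# No change in the policy: the objective equals the advantage.
unchanged = ppo_clipped_objective(ratio=1.0, advantage=2.0)

print(capped_gain, capped_penalty, unchanged)
```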

3. DPO (Direct Preference Optimization)

  • What it is: DPO is a method designed to align models directly with preference data without requiring a reward model or RL algorithms like PPO. It uses preference pairs (e.g., output A is preferred over B) to optimize the model.
  • How it works:
    • Instead of learning a separate reward function, DPO directly optimizes the model to generate preferred outputs based on preference labels.
    • It avoids the complexities of RL (e.g., policy constraints in PPO).
  • Relation to PPO:
    • Both aim to align models with human preferences.
    • DPO is simpler and more efficient but may not achieve the same level of performance in complex scenarios.
  • Relation to SFT: Like PPO, DPO starts from an SFT model and fine-tunes it further using preference data.
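For a single preference pair, the DPO loss can be written with four log-probabilities: the policy's and the frozen reference (SFT) model's scores for the chosen and the rejected response. A minimal sketch, with a hypothetical function name, made-up log-probabilities, and beta=0.1 picked only for illustration:

```python
import math


def dpo_loss(logp_chosen, logp_rejected, ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """DPO loss for one preference pair: -log(sigmoid(beta * margin))."""
    # How much more the policy favors the chosen response over the rejected one,
    # measured relative to the reference model.
    margin = (logp_chosen - ref_logp_chosen) - (logp_rejected - ref_logp_rejected)
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))


# Policy identical to the reference: margin is 0 and the loss is log(2).
neutral = dpo_loss(-10.0, -12.0, -10.0, -12.0)

# Policy raised the chosen response and lowered the rejected one: lower loss.
improved = dpo_loss(-8.0, -14.0, -10.0, -12.0)

print(neutral, improved)
```

No reward model appears anywhere in the computation, which is the simplification the bullets above describe.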

Summary of Relationships

  • SFT: Foundation, establishes a baseline model for downstream fine-tuning.
  • PPO: Uses RL to improve alignment by optimizing a reward function derived from preferences.
  • DPO: Simplifies preference optimization, directly aligning the model with preference data without the need for a reward model or traditional RL.

Together, these methods form a pipeline where SFT provides a task-specific base, and PPO or DPO refine it for alignment and preference optimization.

Wednesday, October 2, 2024

How to use the Fidelity Cash Management VISA Debit card in China


Fidelity Cash Management VISA Debit card

Application link:

https://fidelity.app.link/e/rGWgIGAznNb


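Features

  1. No fee for withdrawing cash at any ATM in the world. The ATM operator sometimes charges its own fee, but Fidelity reimburses it, so you really pay no fees at all.
  2. You deposit US dollars and withdraw local currency, converted automatically at that day's exchange rate, which is very convenient.
  3. No monthly fee and no minimum deposit requirement.
  4. $0 incoming wire transfer fee, for both domestic and international wires.
  5. $0 outgoing wire transfer fee, for both domestic and international wires. This is quite rare: other banks generally charge $25-$30 for an outgoing wire.
  6. You can choose to have the balance in this account automatically placed in the money market fund SPAXX (its yield was as high as 4.95% when this was written in April 2024, and it essentially tracks the Federal Reserve's rate). Before this option existed, the account paid only a weak 2.x%. SPAXX's yield is among the most competitive of any demand-deposit account. It could already be bought in a Fidelity brokerage account, but a checking account is more convenient to use, and this feature makes the Fidelity account far more attractive.
  7. The card is a Visa card, so it works at almost every ATM in the world.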

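Drawbacks

  1. Daily ATM withdrawal limit: $500.
  2. The first transfer into the account can be a hassle. Linking a bank may require waiting for a paper letter that verifies your address, so allow plenty of lead time. Do not open the account right before going abroad and then run out of time to fund it.
  3. Money transferred in may take 3-5 business days before it can be withdrawn (all banks work this way, since funds take time to arrive), so plan ahead and leave enough lead time before going abroad.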

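Notes on using the card at ATMs in China

(Last verified October 1, 2024)
Find an ATM with the VISA logo (for example, China Construction Bank) that is marked "可受理外卡取款" (accepts foreign cards for withdrawals). Insert the card and enter the PIN (a 4-digit numeric PIN); after entering the 4 digits, just confirm. For the language, either Chinese or bilingual Chinese-English works. For the account, choose "Default account", then choose "Withdrawal". Do not choose balance inquiry: it returns an error, and the session is then cancelled automatically. Enter the amount to withdraw, which must not exceed the amount available on the card, or the withdrawal will fail. Whether or not the withdrawal succeeds, the cardholder receives a text message notification.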


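Charles Schwab Checking account: a debit card for fee-free cash withdrawals at ATMs worldwide

Steps:

Open an account with Schwab and apply for the debit card at the same time.

https://www.schwab.com/client-referral?refrid=REFERVBDKWAV2

Opening an account comes with a $1,000 bonus and lets you withdraw RMB cash at ATMs in China with no fees.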


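Features

  1. (The account-opening bonus is actually tied to the Brokerage Account: the required deposit must go into the Brokerage Account, and money kept in the Checking Account does not count. See Charles Schwab Brokerage Account for details.)
  2. No fee for withdrawing cash at any ATM in the world. The ATM operator sometimes charges its own fee, but Charles Schwab reimburses it at the end of the month, so you really pay no fees at all.
  3. You deposit US dollars and withdraw local currency, converted automatically at that day's exchange rate, which is very convenient.
  4. $0 monthly fee and no minimum deposit requirement.
  5. $0 incoming wire transfer fee, which suits receiving wires sent from China.
  6. The card is a Visa card, so it works at almost every ATM in the world.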

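Drawbacks

  1. Applying for this account used to leave a Hard Pull (HP) on Equifax (EQ). (Brokerage + checking together caused only one HP.) [Update] According to the latest data points, opening a Schwab account no longer results in a hard pull.
  2. Daily ATM withdrawal limit: $1000.
  3. Money transferred in may take 3-5 business days before it can be withdrawn (all banks work this way, since funds take time to arrive), so plan ahead and leave enough lead time before going abroad.
  4. This Checking Account is linked to a Schwab investment account, so you must open an investment account at the same time. Both are free, so just open it.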

Reference:
Another very popular card is the Fidelity Cash Management VISA Debit card.
For how to apply for and use it, see:

Here is a free $30 for your food

Here is the link:

https://drd.sh/uzcj2Xh8rlu0ByUo


When you use the link above to sign up for a DoorDash account, you get $30 off ($10 off each of your first 3 orders).


I will also get $20 in credits for my food. Thanks.


Monday, September 16, 2024

[paper+code] Conformer and Zipformer for ASR

Efficient conformer-based speech recognition with linear attention

Paper: https://arxiv.org/pdf/2104.06865

Code: https://github.com/Alex2135/ASR-proto?tab=readme-ov-file


Zipformer: A faster and better encoder for automatic speech recognition

Paper: https://arxiv.org/abs/2310.11230

Code: https://github.com/k2-fsa/icefall