Below is a comprehensive, practical process to automatically track your most-used files on Manjaro Linux and sync those under 1 MB to a USB drive when it’s plugged in, filling the USB to ~80 % capacity. This solution uses common Linux tools, udev for USB detection, scripting for file tracking, and rsync for copying.
Overview of the Solution
- Track file usage (access frequency) on your system.
- Maintain a ranked list of most-used files.
- On USB insertion, compute a target 80 % capacity size.
- Select top files under 1 MB from the usage list until the target capacity.
- Copy those files to the USB.
Components
- File usage tracker → logs accessed files.
- Usage database → tracks frequency and last access times.
- udev rule → triggers sync on USB mount.
- Sync script → selects and copies files to USB.
Assumptions
- You are on Manjaro Linux (Arch-based).
- You have bash, inotifywait (from inotify-tools), rsync, and standard coreutils.
- USB mountpoints are under /run/media/$USER/<label> (common on Manjaro with udisks2/GUI auto-mounting). If you auto-mount elsewhere, adjust the paths accordingly.
Part 1 — Track File Access
We want a daemon that logs the files you use. The simplest reliable metric is file opens.
- Install required tool
sudo pacman -S inotify-tools
- Create a tracker script
Create /usr/local/bin/file_usage_tracker.sh:
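#!/usr/bin/env bash
# Directories to watch recursively; adjust to your needs.
# Each subdirectory consumes one inotify watch, so keep this list tight.
WATCH_DIRS=("$HOME" "/etc" "/usr/local/bin")
# Flat log file: one "<epoch> <path>" line per open event
DB="$HOME/.file_usage.db"
mkdir -p "$(dirname "$DB")"
touch "$DB"
# Log every open event. inotifywait prints the timestamp itself (--timefmt),
# so the loop does not have to spawn `date` for each event.
inotifywait -m -r -q -e open --timefmt '%s' --format '%T %w%f' "${WATCH_DIRS[@]}" |
while IFS= read -r line; do
    path="${line#* }"
    # Skip the log itself: appending to it raises an open event,
    # which would otherwise feed back into the log forever.
    [ "$path" = "$DB" ] && continue
    # Only record regular files
    if [ -f "$path" ]; then
        echo "$line" >> "$DB"
    fi
done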
- Make it executable
sudo chmod +x /usr/local/bin/file_usage_tracker.sh
- Run it on login
Use a systemd user service:
~/.config/systemd/user/file_usage_tracker.service
[Unit]
Description=Track file opens
[Service]
ExecStart=/usr/local/bin/file_usage_tracker.sh
Restart=always
[Install]
WantedBy=default.target
Enable it:
systemctl --user daemon-reload
systemctl --user enable --now file_usage_tracker.service
This now appends every file open to a per-user DB (simple flat log). We will process it later.
Part 2 — Create a Ranked File List
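We must convert the raw log into a frequency list of files. The 1 MB size limit is not applied here; the sync script checks file sizes when it selects files, so the ranking stays a pure usage count.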
Create /usr/local/bin/file_usage_rank.sh:
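#!/usr/bin/env bash
DB="$HOME/.file_usage.db"
RANKED="$HOME/.file_usage_ranked.tsv"
# Drop blank lines, strip the timestamp (cut -f2- keeps paths with spaces intact),
# count opens per path, sort by count, and emit TSV: count<TAB>path
grep -v -E '^[[:space:]]*$' "$DB" | cut -d' ' -f2- | sort | uniq -c | sort -nr |
    awk '{c = $1; sub(/^ *[0-9]+ /, ""); printf "%d\t%s\n", c, $0}' > "$RANKED.tmp"
# Replace the old list in one step so readers never see a half-written file
mv "$RANKED.tmp" "$RANKED"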
Make executable:
sudo chmod +x /usr/local/bin/file_usage_rank.sh
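To see what the ranking step produces, here is a minimal sketch that runs the same kind of pipeline on input from stdin. The function name rank_usage_log is hypothetical, and it assumes each log line has the form "<epoch> <path>" as written by the tracker.

```shell
# Hypothetical helper: read "<epoch> <path>" lines on stdin and
# print "count<TAB>path", most frequently opened path first.
rank_usage_log() {
    grep -v -E '^[[:space:]]*$' |   # drop blank lines
        cut -d' ' -f2- |            # strip the timestamp, keep the whole path
        sort | uniq -c | sort -nr | # count identical paths, highest count first
        awk '{c = $1; sub(/^ *[0-9]+ /, ""); printf "%d\t%s\n", c, $0}'
}

# Usage:
#   rank_usage_log < "$HOME/.file_usage.db" | head -n 20
```

Because only the first space-separated field is removed, a path such as "/home/u/my notes.md" stays intact in the second column.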
You can run this periodically (e.g., daily cron or systemd timer) so the ranked list stays up to date.
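One way to schedule the ranking is a systemd user timer. The sketch below writes a service/timer pair into a directory you pass in (normally ~/.config/systemd/user). The helper name write_rank_timer, the unit names, and the daily schedule are assumptions, not something the steps above require.

```shell
# Hypothetical helper: write a oneshot service and a daily timer
# for the rank script into the directory given as $1.
write_rank_timer() {
    unit_dir=$1
    mkdir -p "$unit_dir" || return 1
    cat > "$unit_dir/file_usage_rank.service" <<'EOF'
[Unit]
Description=Rebuild ranked file usage list

[Service]
Type=oneshot
ExecStart=/usr/local/bin/file_usage_rank.sh
EOF
    cat > "$unit_dir/file_usage_rank.timer" <<'EOF'
[Unit]
Description=Run file_usage_rank.sh daily

[Timer]
OnCalendar=daily
Persistent=true

[Install]
WantedBy=timers.target
EOF
}

# Usage:
#   write_rank_timer "$HOME/.config/systemd/user"
#   systemctl --user daemon-reload
#   systemctl --user enable --now file_usage_rank.timer
```

Persistent=true lets a run that was missed while the machine was off happen at the next opportunity.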
Part 3 — USB Sync Script
This script is triggered on USB insertion.
Save as /usr/local/bin/usb_sync_most_used.sh:
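#!/usr/bin/env bash
# Mount point argument
MOUNTPOINT="$1"
[ -d "$MOUNTPOINT" ] || exit 1
# When started from the system service this runs as root, so $HOME is not the
# desktop user's home. Derive the user from /run/media/<user>/<label> if possible.
USER_HOME="$HOME"
if [[ "$MOUNTPOINT" == /run/media/*/* ]]; then
    owner="${MOUNTPOINT#/run/media/}"
    owner="${owner%%/*}"
    USER_HOME="$(getent passwd "$owner" | cut -d: -f6)"
fi
# Location of ranked file list
RANKED="$USER_HOME/.file_usage_ranked.tsv"
TARGET_DIR="$MOUNTPOINT/most_used_files"
MAX_FILE_BYTES=1048576 # 1 MiB per-file limit
# Fail if missing
[ -f "$RANKED" ] || exit 1
# Compute target size (80% of the filesystem's total size)
TOTAL_BYTES=$(df --output=size -B1 "$MOUNTPOINT" | tail -n1)
TARGET_BYTES=$(( TOTAL_BYTES * 80 / 100 ))
# Prepare
mkdir -p "$TARGET_DIR"
rm -rf "${TARGET_DIR:?}/"* # clear old
ACCUM=0
# Select files in rank order
while IFS=$'\t' read -r count path; do
    # skip if missing/not regular or over the per-file limit
    [ -f "$path" ] || continue
    FILESIZE=$(stat -c%s "$path")
    [ "$FILESIZE" -le "$MAX_FILE_BYTES" ] || continue
    # stop once the next file would exceed the target
    [ $((ACCUM + FILESIZE)) -le "$TARGET_BYTES" ] || break
    ACCUM=$((ACCUM + FILESIZE))
    # progress goes to stderr so it is not mistaken for a path downstream
    echo "Queue $path ($FILESIZE bytes)" >&2
    echo "$path"
done < "$RANKED" | while IFS= read -r file; do
    # --relative recreates the full source path under TARGET_DIR,
    # including any missing parent directories
    rsync -a --relative "$file" "$TARGET_DIR"
done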
Make executable:
sudo chmod +x /usr/local/bin/usb_sync_most_used.sh
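The selection step can be tried on its own, without a USB drive, by isolating it in a function. This is a sketch under assumptions: the name select_within_budget and its two arguments (total byte budget, per-file size limit) are hypothetical, and it expects "count<TAB>path" lines on stdin. Unlike a loop that stops at the first file that does not fit, this variant skips that file and keeps looking for smaller ones.

```shell
# Hypothetical helper: greedy selection in rank order.
# $1 = byte budget (e.g. 80% of the drive), $2 = per-file size limit in bytes.
# Reads "count<TAB>path" on stdin, prints the chosen paths on stdout.
select_within_budget() {
    budget=$1
    max_file=$2
    used=0
    tab=$(printf '\t')
    while IFS=$tab read -r _count path; do
        [ -f "$path" ] || continue                      # missing or not a regular file
        size=$(stat -c %s "$path") || continue
        [ "$size" -le "$max_file" ] || continue         # over the per-file limit
        [ $((used + size)) -le "$budget" ] || continue  # would overshoot; try the next one
        used=$((used + size))
        printf '%s\n' "$path"
    done
}

# Usage (dry run against the real ranked list, 1 MiB limit, 100 MB budget):
#   select_within_budget 100000000 1048576 < "$HOME/.file_usage_ranked.tsv"
```

Printing only paths on stdout keeps the output safe to pipe into a copy step.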
Part 4 — udev Rule to Trigger Sync
You want the script to run when a USB is plugged in and mounted. Running the sync directly from a udev RUN command is fragile: the filesystem is usually not mounted yet when the rule fires, and udev kills long-running RUN processes.
Better: use a udev rule that asks systemd to start a service once the block device appears.
- Create a udev rule:
/etc/udev/rules.d/99-usb-sync.rules
ACTION=="add", SUBSYSTEM=="block", ENV{ID_BUS}=="usb", ENV{ID_FS_TYPE}!="", RUN+="/usr/bin/systemctl start --no-block usb_sync@%k.service"
- Create a systemd template:
/etc/systemd/system/usb_sync@.service
[Unit]
Description=Sync Most Used Files for USB %I
After=local-fs.target
[Service]
Type=oneshot
Environment="MOUNTDEV=%I"
ExecStart=/usr/local/bin/usb_sync_udev_wrapper.sh "%I"
- Create the wrapper to find mountpoint:
/usr/local/bin/usb_sync_udev_wrapper.sh
#!/usr/bin/env bash
DEVNAME="$1"
# Wait up to 10s for mount
for i in {1..10}; do
MOUNT=$(lsblk -o MOUNTPOINT -nr /dev/"$DEVNAME" | head -n1)
[ -n "$MOUNT" ] && break
sleep 1
done
[ -n "$MOUNT" ] && /usr/local/bin/usb_sync_most_used.sh "$MOUNT"
Make executable:
sudo chmod +x /usr/local/bin/usb_sync_udev_wrapper.sh
- Reload:
sudo udevadm control --reload
sudo systemctl daemon-reload
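Before relying on the rule, it can help to check which udev properties your drive actually reports. The sketch below summarises the two that matter most here, ID_FS_TYPE and ID_BUS, from the KEY=VALUE output of udevadm info. The function name summarise_udev_props is hypothetical, and /dev/sdX1 is a placeholder for your partition.

```shell
# Hypothetical helper: read KEY=VALUE lines (as printed by
# `udevadm info --query=property`) and report filesystem type and bus.
# Exit status 0 means ID_FS_TYPE is set, i.e. an ENV{ID_FS_TYPE}!="" match would pass.
summarise_udev_props() {
    fstype=
    bus=
    while IFS='=' read -r key value; do
        case $key in
            ID_FS_TYPE) fstype=$value ;;
            ID_BUS)     bus=$value ;;
        esac
    done
    printf 'fstype=%s bus=%s\n' "${fstype:-none}" "${bus:-none}"
    [ -n "$fstype" ]
}

# Usage (replace sdX1 with your partition):
#   udevadm info --query=property --name=/dev/sdX1 | summarise_udev_props
```

If the summary shows fstype=none, the device (for example a whole disk that only holds a partition table) would not trigger the sync.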
How It Works
- The tracker logs all file opens.
- The rank script builds a sorted list by usage count.
- When any USB block device is plugged in:
  - The udev rule asks systemd to start the usb_sync@ service for that device.
  - The wrapper waits until the device is mounted.
  - The sync script reads the ranked list, selects files ≤1 MB and copies them up to ~80 % of USB capacity.
Optional Improvements
- Exclude certain directories from tracking (e.g., /proc, caches).
- Blacklist file types (e.g., temp or large binaries).
- Exclude duplicates by content hash.
- Add logging for audit and error tracking.
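The first two improvements can be sketched as a single path filter. The function name is_excluded and every pattern in it are examples only; adjust them to your own directory layout and file types.

```shell
# Hypothetical filter: succeed (exit 0) if a path should be ignored.
is_excluded() {
    case $1 in
        /proc/*|/sys/*|/dev/*)      return 0 ;;  # pseudo-filesystems
        */.cache/*|*/.thumbnails/*) return 0 ;;  # caches
        *.tmp|*.swp|*~)             return 0 ;;  # temporary and editor backup files
    esac
    return 1
}

# Usage inside a loop that reads paths:
#   is_excluded "$path" && continue
```

A case statement keeps the check inside the shell, so it adds no extra process per logged event.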
Notes
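- This approach uses a simple inotify-based tracker rather than kernel-level tracing such as auditd, so it only sees opens inside the watched directories.
- The sync runs for every filesystem-bearing block device the udev rule matches. To limit it to one particular drive, add a match on ATTRS{idVendor} and ATTRS{idProduct}, or on ENV{ID_FS_UUID}.
- Keep the tracker's overhead down by watching only the directories you actually care about.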
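When directories are watched recursively, inotify needs one watch per directory, and the kernel caps the number of watches per user. A rough way to estimate the cost of a set of watch directories is to count the directories beneath them. The helper name count_watch_dirs is hypothetical.

```shell
# Hypothetical helper: count directories under the given roots
# (staying on each root's own filesystem).
count_watch_dirs() {
    find "$@" -xdev -type d 2>/dev/null | wc -l
}

# Usage: compare against the per-user kernel limit.
#   needed=$(count_watch_dirs "$HOME" /etc /usr/local/bin)
#   limit=$(cat /proc/sys/fs/inotify/max_user_watches)
#   echo "need $needed of $limit watches"
```

If the count approaches the limit, narrowing the watch list is usually the first thing to try. The limit itself can be raised through the fs.inotify.max_user_watches sysctl.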