WARNING: this is a WIP doc. It might never see another edit. We'll see if I need a NAS.
These instructions and notes are tuned to my needs. There is no attempt to write an omnibus or step-by-step guide. Therefore, please look at what I need from a NAS and check whether it aligns with your needs before reading on!
- Backup: backing up phone data and computer files
- Great attention to Photos!
- RAID 1 configuration for photos and files, i.e. the most important data. "One copy is not a copy" - CGP Grey
- Plex server: being able to stream movies
- Immich: being able to stream photos with metadata
RAID is a way of combining multiple disks. In RAID 0 the data is split (striped) across the disks, so if one disk fails, all data is lost. In RAID 1 the data is mirrored across the disks, so if one disk fails, the data is still available on the other disk.
An important note is that RAID is not a backup. RAID is a scheme that can help you protect your data, but it does not guarantee its safety. For example, you could have 2 SSDs in RAID 1 to keep a mirror image of your data. In the case of a double disk failure (which can happen due to environmental factors such as a fire, but also when both disks reach their end of life at the same time, see the disk section), you would lose all your data.
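To make the RAID 0 vs RAID 1 trade-off concrete, here is a minimal sketch in Python. The function name `raid_summary` and the disk sizes are made up for illustration, and the capacity rule (the smallest disk sets the limit) is the simplified textbook one, not what any specific tool reports.

```python
def raid_summary(level: int, disk_sizes_tb: list[float]) -> dict:
    """Return usable capacity and tolerated disk failures for RAID 0 or RAID 1.

    Simplified model: every disk contributes only as much as the smallest one.
    """
    n = len(disk_sizes_tb)
    if n < 2:
        raise ValueError("RAID needs at least two disks")
    smallest = min(disk_sizes_tb)
    if level == 0:
        # Striping: capacities add up, but losing any disk loses everything.
        return {"usable_tb": smallest * n, "failures_tolerated": 0}
    if level == 1:
        # Mirroring: every disk holds a full copy, so only one disk of capacity.
        return {"usable_tb": smallest, "failures_tolerated": n - 1}
    raise ValueError("only RAID 0 and RAID 1 are modelled here")


if __name__ == "__main__":
    for level in (0, 1):
        print(f"RAID {level}:", raid_summary(level, [4, 4]))
```

With two 4 TB disks, RAID 1 gives 4 TB of usable space and survives one disk failure, while RAID 0 gives 8 TB and survives none. Neither number says anything about fire, theft, or accidental deletion, which is why a separate backup is still needed.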
A software suite for sharing files over the local network, especially with Windows machines.
- Purpose:
- LXC: More suitable for scenarios where a complete operating system environment with isolation is desired.
- Docker: Focuses on the development, deployment, and scaling of applications.
- Isolation:
- LXC provides system-level isolation similar to traditional virtualization.
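- Docker offers application-level isolation, making it easier to share resources between containers.
In summary, Docker can be thought of as historically built upon LXC: early versions used LXC as the execution backend, while current versions use their own runtime.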
ZFS (Zettabyte File System) is a high-performance and highly scalable file system and logical volume manager. It is designed to provide robust data protection, efficient data storage, and powerful data management features.
- Data Integrity:
- End-to-End Checksumming: ZFS uses checksums to detect and correct silent data corruption, ensuring data integrity from disk to disk.
- Self-Healing: Automatically detects and repairs data corruption using redundant data copies.
- Snapshots and Clones:
- Snapshots: ZFS allows the creation of read-only snapshots of the file system at any point in time, useful for backups and data protection.
- Clones: Writable copies of snapshots can be created without consuming additional storage space initially, enabling efficient data replication and testing.
- Data Compression:
- ZFS supports built-in data compression algorithms, allowing for reduced storage space usage and potentially increased performance due to reduced I/O.
- RAID-Z:
- Integrated RAID: Unlike traditional RAID setups, ZFS includes its own RAID implementation called RAID-Z, offering better protection against data loss.
- No Write Hole: RAID-Z eliminates the RAID write hole issue, where data can become inconsistent if a system crash occurs during write operations.
- Copy-on-Write (COW):
- Atomic Transactions: ZFS uses a copy-on-write mechanism, which ensures that updates to data are made atomically, preventing data corruption in case of a system failure during a write operation.
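The checksumming and self-healing ideas above can be illustrated with a toy model. This is only a sketch of the concept, not how ZFS is implemented: the class `ToyMirror`, its in-memory "disks", and the choice of SHA-256 are all assumptions made for the example.

```python
import hashlib


def checksum(data: bytes) -> str:
    """Checksum of a block (SHA-256 is an illustrative choice)."""
    return hashlib.sha256(data).hexdigest()


class ToyMirror:
    """Two in-memory 'disks' holding the same blocks, plus stored checksums."""

    def __init__(self) -> None:
        self.copies = [{}, {}]   # one dict per disk: block id -> bytes
        self.checksums = {}      # block id -> checksum recorded at write time

    def write(self, block_id: str, data: bytes) -> None:
        self.checksums[block_id] = checksum(data)
        for disk in self.copies:
            disk[block_id] = data

    def read(self, block_id: str) -> bytes:
        expected = self.checksums[block_id]
        # Find any copy whose checksum still matches the recorded one.
        good = next(
            (d[block_id] for d in self.copies if checksum(d[block_id]) == expected),
            None,
        )
        if good is None:
            raise IOError(f"all copies of {block_id!r} are corrupted")
        # Self-healing: overwrite the copies that no longer match.
        for disk in self.copies:
            if checksum(disk[block_id]) != expected:
                disk[block_id] = good
        return good


if __name__ == "__main__":
    mirror = ToyMirror()
    mirror.write("photo-1", b"holiday picture")
    mirror.copies[0]["photo-1"] = b"holiday pixture"  # simulate silent corruption
    print(mirror.read("photo-1"))                     # served from the healthy copy
    print(mirror.copies[0]["photo-1"])                # the bad copy was repaired
```

The point of the sketch is that detection needs only the checksum, while repair needs a second, healthy copy. With a single disk and no redundancy, corruption can be detected but not fixed.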
Backup strategy for services: MorroLinux video in Italian
Single-board computer: ZimaBoard. Pros:
- x86! Easily expandable without waiting for ARM-compatible hardware
- Low power consumption
- Great connections for expansion
- 1 PCIe 2.0 x4 -> one can attach new hardware (NVMe, Ethernet, etc.)
- 2 SATA 6.0 Gb/s
- 2 Gigabit LAN
- 1 mini-DisplayPort
Why not a Raspberry Pi?
- ARM architecture
- fewer I/O connections
- less expandable (no PCIe)
Similar to the ZimaBoard but cheaper. Cons:
- Higher power consumption on the CPU side
- Heat dissipation on the BOTTOM side
The one that ships with the ZimaBoard. Morro: "not ready because it requires a lot of CLI interaction".
- every user has full access to the files (permissions 777!). See this issue and the workarounds in it.
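To see how widespread the 777 problem is on a share, one option is a small scan like the sketch below. The helper names and the example path are hypothetical, and the script only reports files; it does not change any permissions.

```python
import os
import stat


def is_fully_open(mode: int) -> bool:
    """True if the permission bits are exactly rwxrwxrwx (777)."""
    return stat.S_IMODE(mode) == 0o777


def find_fully_open(root: str) -> list[str]:
    """Walk `root` and list files and directories whose mode is 777."""
    hits = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            mode = os.lstat(path).st_mode
            if stat.S_ISLNK(mode):
                continue  # symlinks report 777 on Linux, which is not meaningful
            if is_fully_open(mode):
                hits.append(path)
    return hits


if __name__ == "__main__":
    # "/DATA" is a placeholder; point it at the share you want to inspect.
    for path in find_fully_open("/DATA"):
        print(path)
```

A mode of 777 means owner, group, and everyone else can read, write, and execute, so any user or service on the box can modify or delete those files.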
Debian-based distribution.
Debian-based distribution that supports virtualization (LXC containers and VMs). Pros:
- allows taking a snapshot of each container, hence saving its state
Installation:
0. Configure the BIOS to enable virtualization: turn on Intel Virtualization and VT-d
- select ZFS in RAID 1 (mirroring)
- install the system
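After changing the BIOS settings, one way to confirm from a running Linux system that CPU virtualization is visible is to look for the `vmx` (Intel VT-x) or `svm` (AMD-V) flag in `/proc/cpuinfo`. The sketch below does only that; the function name is made up, and it does not check VT-d, which is not reported among these CPU flags.

```python
def virtualization_flags(cpuinfo_text: str) -> set[str]:
    """Return the hardware virtualization flags found in /proc/cpuinfo text."""
    found = set()
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags") and ":" in line:
            flags = line.split(":", 1)[1].split()
            # vmx = Intel VT-x, svm = AMD-V
            found |= {"vmx", "svm"} & set(flags)
    return found


if __name__ == "__main__":
    with open("/proc/cpuinfo") as f:
        flags = virtualization_flags(f.read())
    if flags:
        print("CPU virtualization visible:", ", ".join(sorted(flags)))
    else:
        print("No vmx/svm flag found; check the BIOS settings")
```

An empty result usually means the option is still disabled in the BIOS or the CPU does not support it.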
Configurations:
- Create an LXC container on the first node
- template: use the guide provided in the tab (must be connected to the internet)
- Storage: one can add more disks
- RAM: 2 GB
- Static IP (IPv4): 192.168.1.id_container/CIDR and gateway
- Enable start at boot (in the Options tab)
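The static IP convention above (last octet equal to the container ID) can be sketched with Python's `ipaddress` module. The network `192.168.1.0/24` and the choice of the first host as gateway are assumptions for the example; use the values of your own LAN.

```python
import ipaddress


def container_address(container_id: int, network: str = "192.168.1.0/24") -> tuple[str, str]:
    """Return (address in CIDR notation, gateway) for a container ID.

    Assumes the gateway is the first usable host of the network.
    """
    net = ipaddress.ip_network(network)
    ip = net.network_address + container_id
    if ip not in net or ip in (net.network_address, net.broadcast_address):
        raise ValueError(f"container id {container_id} does not fit in {net}")
    gateway = next(net.hosts())
    return f"{ip}/{net.prefixlen}", str(gateway)


if __name__ == "__main__":
    address, gateway = container_address(105)
    print(address, gateway)
```

For container 105 this gives `192.168.1.105/24` with gateway `192.168.1.1`. IDs above 254 do not fit in a /24 and raise an error, so the convention only works for a limited range of container IDs.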
Self-hosting for photos. Setup: run in a container in Proxmox, with expandable (virtual) storage. Pros:
- fast
- map for photos with geo metadata
- possibility to back up directly from the phone. Possibility to keep a photo on the NAS only, on the phone only, or on both.
The installation is quite straightforward: use docker compose.
Cons:
- do not expose it to the internet, because everything is sent in the clear and it uses http. If you want to access it remotely, set up a VPN.
General tip: do not buy disks of the same type, from the same lot, and at the same time. Otherwise, the chances of both disks failing at the same time are higher.
- HD
Avoid SMR-type HDs. These are quite slow during a RAID rebuild (up to 1 week per terabyte).
Usually, HDs are louder, consume more power, and are slower than SSDs. Regarding speed, keep an eye on the real bottleneck of the setup, because it might be the network connection rather than the R/W speed.
HDs are generally cheaper, but high-end ones do not seem to be significantly better value than SSDs.
- WD Red Plus 4TB
- SSD
Watch out for the TBW (Terabytes Written) value. This is the amount of data that can be written to the disk before it starts to fail. Some types of RAID can cause a lot of undesired writes: "Beware of using parity RAIDs with SSDs as they cause "write amplification" - updating a small fragment of data requires updating lots of metadata, so can cause lots of unexpected writes." (source). In this regard, modern file systems are SSD-aware, and configurations such as RAID 1 or 10 minimize such write amplification. You should keep an eye on processes that create a lot of logs, as those can be detrimental to the SSD.
Another aspect is the number of bits stored per cell, which affects both the endurance and the performance of the SSD; see multi-level cell.
- Samsung 870 QVO
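The two rules of thumb above (check the real bottleneck, and watch the TBW rating) come down to simple arithmetic, sketched here. All numbers in the example are hypothetical: take the TBW rating from your disk's datasheet and measure your own daily writes and link speed.

```python
def ssd_lifetime_years(tbw_tb: float, daily_writes_gb: float,
                       write_amplification: float = 1.0) -> float:
    """Rough years until the rated TBW is reached.

    `write_amplification` is a multiplier for extra writes (e.g. parity RAID
    or heavy logging); 1.0 means no amplification.
    """
    daily_tb = daily_writes_gb * write_amplification / 1000
    return tbw_tb / daily_tb / 365


def bottleneck_mb_s(link_mbit_s: float, disk_mb_s: float) -> float:
    """Effective throughput in MB/s: the slower of network link and disk."""
    return min(link_mbit_s / 8, disk_mb_s)


if __name__ == "__main__":
    # Hypothetical values: 360 TB rated endurance, 50 GB written per day.
    print(f"{ssd_lifetime_years(360, 50):.1f} years without amplification")
    print(f"{ssd_lifetime_years(360, 50, write_amplification=3):.1f} years with 3x amplification")
    # Gigabit LAN (1000 Mbit/s) in front of a disk that can do 500 MB/s.
    print(f"{bottleneck_mb_s(1000, 500):.0f} MB/s effective")
```

In the example, a Gigabit link caps transfers at 125 MB/s no matter how fast the disk is, so a faster SSD would not speed up access over that network. The lifetime estimate is linear and ignores everything except rated endurance, so treat it as an order-of-magnitude check.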
- Video series from MorroLinux (in Italian)
- Proxmox and RAID (Italian)
- MergerFS + SnapRAID (Italian)
- Good HD buying guide from the DataHoarder subreddit
- Transcoding
- A nice playlist about self-hosting is curated on Wolfgang's channel. He is very clear and concise.