The Tag Image File Format (TIFF) is a widely used standard for scientific imaging (microscopy, astronomy, geospatial). Unlike ordinary photo formats, TIFF files can hold high bit depths and multi-page stacks.
Validate scientific integrity and technical metadata. Our objective is to detect hidden lossy compression (e.g., JPEG inside TIFF), inventory multi-page stacks, and extract embedded metadata (ImageJ tags) that are crucial for pixel-based analysis.
Hidden lossy compression and the loss of proprietary metadata during format conversion are primary risks. Scientific TIFFs often appear “black” or empty in standard viewers, leading to accidental deletion or incorrect processing if not properly inspected.
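The "black image" effect mentioned above can be illustrated with a little arithmetic. The sketch below assumes hypothetical 12-bit sensor values stored in a 16-bit container and a viewer that maps the full 16-bit range linearly onto an 8-bit display; real viewers may behave differently.

```r
# Hypothetical pixel values from a 12-bit sensor (0..4095) stored as 16-bit.
pixels <- c(0, 512, 2048, 4095)

# A viewer mapping the full 16-bit range (0..65535) to 8-bit display levels:
naive_8bit <- floor(pixels / 65535 * 255)
print(naive_8bit)
# → 0 1 7 15   (nearly black on a 0..255 display)

# Stretching to the actual data range reveals the content; the file is unchanged.
stretched <- round((pixels - min(pixels)) / (max(pixels) - min(pixels)) * 255)
print(stretched)
# → 0 32 128 255
```

The data are intact in both cases; only the display mapping differs, which is why such files should be inspected rather than deleted.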
This notebook employs a multi-strategy approach to inspection:
We use the magick package (an R interface to ImageMagick) to read image headers and attributes, and we load the tiff package (LibTIFF bindings) for direct access to raw TIFF data. Both can open multi-page and high-bit-depth files.
If you do not have the required packages, run this command once in your R console:
# install.packages(c("tidyverse", "tiff", "magick", "stringr", "rstudioapi"))
library(tidyverse)
library(tiff) # Raw LibTIFF
library(magick) # ImageMagick
library(stringr) # Text Mining
library(rstudioapi) # UI interaction
Select the folder containing the TIFF files.
Note: When run interactively on Windows (see the check below), a directory-selection dialog appears. Otherwise, the notebook falls back to the target_dir parameter.
if (interactive() && .Platform$OS.type == "windows") {
  selected_dir <- rstudioapi::selectDirectory(caption = "Select TIFF Directory")
} else {
  selected_dir <- NULL
}
if (!is.null(selected_dir)) {
  target_dir <- selected_dir
} else {
  target_dir <- params$target_dir
}
print(paste("Analyzing directory:", target_dir))
[1] "Analyzing directory: data/Inspect_tiff/"
We scan for .tif and .tiff files.
tiff_files <- list.files(
  path = target_dir,
  pattern = "\\.tiff?$",
  recursive = TRUE,
  full.names = TRUE,
  ignore.case = TRUE
)
print(paste("Found", length(tiff_files), "TIFF files."))
[1] "Found 6 TIFF files."
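To check which file names the scan pattern accepts, it can be tried on a few made-up names. The names below are hypothetical; the pattern and the case-insensitive matching are the ones passed to list.files above.

```r
# Same regular expression as in the list.files() call.
pattern <- "\\.tiff?$"

# Hypothetical file names, chosen to probe the edges of the pattern.
candidates <- c("stack.tif", "scan.TIFF", "notes.tiff.bak",
                "image.tiffs", "tif_summary.csv")

matches <- grepl(pattern, candidates, ignore.case = TRUE)
print(setNames(matches, candidates))
# stack.tif and scan.TIFF match; the other three do not.
```

The `$` anchor is what keeps backup copies such as `notes.tiff.bak` out of the inventory.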
This function scans all file attributes for keywords (e.g., “depth”, “compression”) to find metadata even if it uses non-standard tag names.
message("Generating Deep Inspection Report...")
inspect_tiff_fuzzy <- function(fp) {
  fname <- basename(fp)
  tryCatch({
    # Read Image Header
    img <- image_read(fp)
    info <- image_info(img)
    attrs <- image_attributes(img)
    # Fuzzy Extraction: Bit Depth
    # Find ANY attribute containing "depth" or "bits" (case insensitive)
    depth_check <- attrs %>%
      filter(grepl("depth|bits", property, ignore.case = TRUE)) %>%
      pull(value)
    # Logic: Use found attribute, or fallback to info$depth, or "Unknown"
    final_depth <- if (length(depth_check) > 0) {
      paste(unique(depth_check), collapse = "/")
    } else if ("depth" %in% names(info)) {
      as.character(info$depth[1])
    } else {
      "Unknown"
    }
    # Fuzzy Extraction: Compression
    # Find ANY attribute containing "compression"
    comp_check <- attrs %>%
      filter(grepl("compression", property, ignore.case = TRUE)) %>%
      pull(value)
    final_comp <- if (length(comp_check) > 0) comp_check[1] else "Unknown"
    # Fuzzy Extraction: Resolution
    res_check <- attrs %>%
      filter(grepl("resolution|density", property, ignore.case = TRUE)) %>%
      pull(value)
    final_res <- if (length(res_check) > 0) paste(res_check[1], "DPI") else paste(info$density[1], "DPI")
    # Build Row
    tibble(
      FileName = fname,
      Dimensions = paste(info$width[1], "x", info$height[1]),
      BitDepth = final_depth,
      Compression = final_comp,
      ColorSpace = info$colorspace[1],
      Resolution = final_res,
      FileSize_MB = round(file.size(fp) / 1024^2, 2),
      Status = "Success"
    )
  }, error = function(e) {
    # Error Handling
    message(paste("Failed on:", fname, "-", e$message))
    tibble(
      FileName = fname, Dimensions = NA, BitDepth = NA, Compression = NA,
      ColorSpace = NA, Resolution = NA, FileSize_MB = NA,
      Status = paste("Error:", e$message)
    )
  })
}
# Execute Analysis
report <- map_dfr(tiff_files, inspect_tiff_fuzzy)
# Display
print("--- Deep Inspection Preview ---")
[1] "--- Deep Inspection Preview ---"
print(head(report))
# A tibble: 6 × 8
  FileName     Dimensions BitDepth Compression ColorSpace Resolution FileSize_MB
  <chr>        <chr>      <chr>    <chr>       <chr>      <chr>            <dbl>
1 KL27_14D_KO… 1936 x 14… Unknown  Unknown     Gray       2x2 DPI           3.59
2 Td014_7D_MC… 2048 x 20… Unknown  Unknown     Gray       1x1 DPI           4
3 Td014_7D_MC… 2048 x 20… Unknown  Unknown     Gray       26x26 DPI         4
4 Td62_7D_MCA… 2048 x 20… Unknown  Unknown     Gray       6x6 DPI           4
5 Td63_7D_MCA… 2048 x 20… Unknown  Unknown     Gray       6x6 DPI           4
6 Td78_3D_MCA… 2048 x 20… Unknown  Unknown     Gray       6x6 DPI           4
# ℹ 1 more variable: Status <chr>
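The preview reports BitDepth and Compression as "Unknown" for every file, which means no attribute matched the keywords. One way to cross-check is to read the baseline tags straight from the file. The following is a minimal base-R sketch, not part of the notebook: read_tiff_tags and compression_names are hypothetical names, only classic TIFF (not BigTIFF) is handled, and only BitsPerSample (tag 258) and Compression (tag 259) are read. Each image file directory (IFD) it walks corresponds to one page, so the row count also gives a page inventory.

```r
# Names for common Compression (tag 259) codes from the TIFF 6.0 spec and libtiff.
compression_names <- c("1" = "None", "5" = "LZW", "6" = "OldJPEG", "7" = "JPEG",
                       "8" = "Deflate", "32773" = "PackBits", "32946" = "Deflate")

read_tiff_tags <- function(path, max_pages = 10000L) {
  con <- file(path, "rb")
  on.exit(close(con))

  # Bytes 0-1 give the byte order: "II" = little-endian, "MM" = big-endian.
  byte_order <- readBin(con, "raw", n = 2)
  endian <- if (identical(byte_order, charToRaw("II"))) {
    "little"
  } else if (identical(byte_order, charToRaw("MM"))) {
    "big"
  } else {
    stop("Not a TIFF file: ", path)
  }
  u16 <- function(src = con, n = 1) {
    readBin(src, "integer", n = n, size = 2, signed = FALSE, endian = endian)
  }
  # Note: read as signed 32-bit, so offsets beyond 2 GB are not handled here.
  u32 <- function(src = con) readBin(src, "integer", n = 1, size = 4, endian = endian)

  if (!identical(u16(), 42L)) stop("Only classic TIFF (magic number 42) is handled")

  pages <- list()
  offset <- u32()                       # offset of the first IFD
  while (length(offset) == 1 && offset > 0 && length(pages) < max_pages) {
    seek(con, offset)
    n_entries <- u16()
    bits <- 1L; comp <- 1L              # TIFF defaults when a tag is absent
    for (i in seq_len(n_entries)) {
      tag <- u16(); type <- u16(); count <- u32()
      field <- readBin(con, "raw", n = 4)
      if (!(tag %in% c(258L, 259L)) || type != 3L) next
      if (count <= 2) {
        vals <- u16(field, count)       # one or two SHORT values fit inline
      } else {
        resume <- seek(con, u32(field)) # seek() returns the position before the move
        vals <- u16(n = count)
        seek(con, resume)
      }
      if (tag == 258L) bits <- vals else comp <- vals[1]
    }
    comp_name <- unname(compression_names[as.character(comp)])
    pages[[length(pages) + 1]] <- data.frame(
      page = length(pages) + 1L,
      bits_per_sample = paste(bits, collapse = ","),
      compression = if (is.na(comp_name)) paste0("Other (", comp, ")") else comp_name,
      lossy = comp %in% c(6L, 7L)       # OldJPEG and JPEG
    )
    offset <- u32()                     # next IFD; 0 marks the end of the chain
  }
  do.call(rbind, pages)
}

# Usage (path is a placeholder): read_tiff_tags("path/to/image.tif")
```

If the tiff or ijtiff packages are available, their own readers are likely a better choice for production use; this sketch only shows that the tags are readable without decoding any pixel data.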
Save the report to a CSV file for review.
output_dir <- "Results/Inspect_tiff"
dir.create(output_dir, recursive = TRUE, showWarnings = FALSE)
output_file <- file.path(output_dir, paste0("TIFF_DeepScan_", format(Sys.Date(), "%Y%m%d"), ".csv"))
write.csv(report, output_file, row.names = FALSE)
print(paste("Report saved to:", output_file))
[1] "Report saved to: Results/Inspect_tiff/TIFF_DeepScan_20260515.csv"
Use this report to verify scientific integrity:
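Bit Depth (16 or 32): These are true scientific images. Do not convert them to JPG or PNG: JPEG is effectively limited to 8 bits per channel and PNG tops out at 16-bit integers, so conversion can discard data precision. If BitDepth remains “Unknown”, no attribute matching the keywords was found. The file may lack a proper header, or ImageMagick may simply not expose the tag under that name, so confirm with a direct tag reader before drawing conclusions.
Compression: Recommended (Uncompressed, LZW, Deflate, PackBits); not recommended (JPEG, OldJPEG). JPEG compression inside a TIFF is lossy, so the stored pixel values no longer match the original acquisition and the data integrity for pixel-based analysis is compromised.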
Color Space (Gray vs sRGB): Scientific microscopy data is typically Gray (intensity only). If a raw microscope file is sRGB, it may have been converted or screenshotted, potentially losing its dynamic range.
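The three checks above can be applied to the whole report at once. The sketch below is one possible way to do that in base R; flag_report is a hypothetical helper, the column names are those of the report built earlier, and the demo rows are made up for illustration.

```r
flag_report <- function(report) {
  # JPEG or OldJPEG anywhere in the compression string counts as lossy.
  lossy <- grepl("jpeg", report$Compression, ignore.case = TRUE)
  # Missing or "Unknown" bit depth needs a manual follow-up.
  unknown_depth <- is.na(report$BitDepth) | report$BitDepth == "Unknown"
  # Anything other than Gray deserves a second look for raw microscopy data.
  not_gray <- !is.na(report$ColorSpace) & report$ColorSpace != "Gray"

  labels <- c("lossy compression", "bit depth unknown", "not grayscale")
  report$Flags <- vapply(seq_len(nrow(report)), function(i) {
    paste(labels[c(lossy[i], unknown_depth[i], not_gray[i])], collapse = "; ")
  }, character(1))
  report
}

# Made-up rows, for illustration only.
demo <- data.frame(
  FileName = c("a.tif", "b.tif", "c.tif"),
  BitDepth = c("16", "8", "Unknown"),
  Compression = c("LZW", "JPEG", "Unknown"),
  ColorSpace = c("Gray", "sRGB", "Gray")
)
flagged <- flag_report(demo)
print(flagged[, c("FileName", "Flags")])
```

An empty Flags value means none of the three conditions fired; it does not by itself prove the file is sound.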
ImageJ / Fiji: The de facto standard open-source software for viewing and analyzing scientific TIFFs (microscopy, astronomy). It handles multi-page stacks and high-bit-depth data that confuse standard viewers (Schindelin et al. 2012).
Bio-Formats: A library specifically designed to read proprietary life sciences image formats (e.g., .czi, .lif) and convert them to standard OME-TIFF (see the Bio-Formats documentation).
ExifTool: A robust command-line tool for reading and writing embedded metadata (EXIF, IPTC, XMP) in images (see the documentation).
For users who want to run this analysis on a server (HPC), in a batch job, or from the command line, here is the pure R script version.
Download the R Script: Inspect_tiff_Script.R
Slurm submission script (Inspect_tiff_submit.sh):
#!/bin/bash
#SBATCH --job-name=tiff_check
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --time=00:30:00
#SBATCH --mem=16G
#SBATCH --output=logs/tiff_check_%j.log
# Load R Module
# Note: ImageMagick is often a system library, but some clusters require a module.
# Check with 'module avail imagemagick' if the script fails.
module load R
module load imagemagick/7.1 # Adjust version as needed for your cluster
# Define Target Directory
# Replace with the actual path to your microscopy/image data
DATA_DIR="/scratch/user/project_data/microscopy"
# Prepare Environment
mkdir -p Results/Inspect_tiff
mkdir -p logs
# Run Analysis
echo "Starting TIFF Deep Scan on $DATA_DIR"
Rscript Inspect_tiff_Script.R "$DATA_DIR"
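The submit script passes the data directory as the first command-line argument. The contents of Inspect_tiff_Script.R are not shown here, so the following is only a sketch of how such a script might pick up that argument. resolve_target_dir is a hypothetical helper, and the default path is the one that appears in the notebook output.

```r
# Pick the target directory from the command line, falling back to a default.
resolve_target_dir <- function(args = commandArgs(trailingOnly = TRUE),
                               default = "data/Inspect_tiff/") {
  target <- if (length(args) >= 1 && nzchar(args[1])) args[1] else default
  # Fail early with a clear message instead of scanning a missing folder.
  if (!dir.exists(target)) stop("Directory not found: ", target)
  normalizePath(target)
}

# In the script this would replace the interactive selection, for example:
# target_dir <- resolve_target_dir()
```

Failing early matters in a batch job, where an empty file list would otherwise produce an empty report and a zero exit status.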